1. Introduction

The development of microcomputers during this decade sadly 
resembles the development of the early computers in the 1950's 
- software and programming tools have failed to keep pace with 
hardware developments. A recent survey showed that the bulk of 
microcomputer programs were written in machine code. Intel 
therefore are to be congratulated in their attempt to break 
with tradition through the introduction of PL/M. This language, 
despite its deficiencies, permits a far greater programmer 
productivity (and, more important, program reliability) than 
previous software tools. Unfortunately the only PL/M compiler 
available to date has been a cross-compiler, which requires 
ready access to large and expensive computers. 

This report describes a resident compiler for the PL/M language 
for the Intel 8080-based microcomputer system MYCRO-1. The project
was initially proposed to NTNF in the summer of 1975. After
much discussion, NTNF finally granted A/S MYCRON kr 200.000.- to 
develop the compiler. MYCRON placed the development contract with 
Norsk Regnesentral in July 1976, and the compiler was completed 
by December 1976. 

The report was written jointly by the members of the implementation
team:


Øystein Halvorsen
Kari Johnsen 
Sigurd Kubosch 

Paul Wynn (project leader) 

The proposal in Appendix A is the work of Bjørn Myrhaug, without
whom this project (and many others) would never have come into 
being. 
2. Objectives

The following requirements were placed upon the resident PL/M 
compiler: 


it should accept standard PL/M 

it should generate efficient, space-saving code 

the total size should not exceed 32K bytes 

it should be possible to generate relocatable 
object code. 

In the light of experience gained from previous compiler 
projects at NR, especially PL/MIPROC, it was decided to write 
the compiler as a single pass, using the recursive descent 
technique. PL/M itself was considered unsuitable as an
implementation language, being non-recursive and of course 
only available on large computers. Assembler was the only 
alternative as an implementation language. 
3. Compiler structure

The logical structure of the compiler closely resembles the 
syntactical structure of PL/M. In principle there exists one 
semantic subroutine for each non-terminal symbol in the PL/M 
grammar. This subroutine is responsible for matching the input 
stream against the right-hand side of the grammatical rule 
concerned, simultaneously generating the required object code. 
Terminal symbols are extracted directly from the input stream 
by a scanner co-routine (SCANSYM, see Section 4), and non-terminals
are resolved by calling the appropriate semantic
routine.

Consider, for example, a derivation of the rule 

<ifstatement> ::= IF <expression> THEN <statement>

Here, the subroutine IFSTATEMENT will call SCANSYM to extract 
IF from the input stream, and then EXPRESSION to scan <expression> 
and generate the associated object code. A "jump-on-false" instruction
is then output, whose destination address is unknown
at this point. SCANSYM is called to extract THEN, and STATEMENT 
is called to scan <statement> and output the object code for 
this. Finally the "jump-on-false" destination address is fixed-up. 
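
The IFSTATEMENT walk-through above can be sketched as follows. This is only an illustration in Python under stated assumptions: the token list, the opcode names (EVAL, STMT, JUMP_IF_FALSE) and the helper names are hypothetical and do not reflect the compiler's actual assembler routines or object code.

```python
# Hypothetical sketch of recursive descent with a jump fix-up.
# Opcode names and data structures are illustrative assumptions.

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0
        self.code = []  # emitted "object code", one entry per instruction

    def scansym(self):
        """Return the next symbol and advance (stand-in for SCANSYM)."""
        sym = self.tokens[self.pos]
        self.pos += 1
        return sym

    def expect(self, wanted):
        sym = self.scansym()
        if sym != wanted:
            raise SyntaxError(f"expected {wanted}, found {sym}")

    def expression(self):
        # Placeholder: a real EXPRESSION would parse operators and operands.
        self.code.append(("EVAL", self.scansym()))

    def statement(self):
        if self.tokens[self.pos] == "IF":
            self.ifstatement()
        else:
            self.code.append(("STMT", self.scansym()))

    def ifstatement(self):
        self.expect("IF")
        self.expression()
        jump_at = len(self.code)
        self.code.append(["JUMP_IF_FALSE", None])  # destination not yet known
        self.expect("THEN")
        self.statement()
        self.code[jump_at][1] = len(self.code)     # fix up the destination


def compile_if(tokens):
    parser = Parser(tokens)
    parser.ifstatement()
    return parser.code
```

The jump is emitted with an empty destination and patched once the code for the controlled statement has been output, which is the only point at which the address is known in a single pass.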

The analysis is therefore top-down. One disadvantage with 
this approach is that code optimisation becomes very difficult 
- communication between different invocations of the various 
semantic procedures is practically impossible. Therefore the 
recursive descent analysis is not pursued down to the lowest
level: code for assignments and expressions (including subscripts)
is generated in a bottom-up fashion using the conventional stack
and incoming symbol technique (cf. Randell and Russell, Algol 60
Implementation), which permits local optimisation within a
one-pass framework.
4. The scanner

The main function of the scanner is to transform the input 
program into a stream of basic PL/M symbols and feed these 
one by one to the code-generation routines. The scanner is 
in principle a coroutine, and is operated from without by 
two routines, SCANSYM and SCANBACK. A call on SCANSYM returns 
the next basic symbol, whereas SCANBACK forces the next call
on SCANSYM to return the last basic symbol once again, thus
providing a one-symbol lookahead mechanism.
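
The SCANSYM/SCANBACK pairing can be illustrated with a minimal Python sketch. The class and attribute names are hypothetical, and the "EOF" marker is an assumption made for the example; the real scanner is an assembler coroutine.

```python
class Scanner:
    """Hypothetical stand-in for the scanner's one-symbol lookahead."""

    def __init__(self, symbols):
        self._symbols = iter(symbols)
        self._last = None      # most recently returned symbol
        self._replay = False   # set by scanback()

    def scansym(self):
        """Return the next basic symbol, or the last one again after scanback()."""
        if self._replay:
            self._replay = False
            return self._last
        self._last = next(self._symbols, "EOF")
        return self._last

    def scanback(self):
        """Force the next scansym() call to return the last symbol once more."""
        self._replay = True
```

A caller that reads one symbol too far simply calls `scanback()` before returning, so the routine that follows sees the same symbol.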

The scanner also handles the macro ("LITERALLY") facility by 
substituting every macro call with its associated text value. 

Whenever an unknown identifier symbol is encountered in the 
source text, it is stacked onto the identifier symbol table. 


The scanner consists of four main parts: 

a routine for reading single input characters 
and building up PL/M basic symbols (SCAN) 

a routine for classifying symbols and looking 
them up in symbol tables (SCANSYM) 

a macro-handler 

a set of symbol tables. 

4.1 SCAN 

The routine SCAN extracts characters from the input stream by 
calling the subroutine SCNIN. A call on the latter routine returns
either the next source input character, the next character
from a macro, or sets the carry bit to indicate that the input 
stream has been exhausted. The source text is considered as a 
continuous character string, with each CR/LF replaced by a 
space, and where every macro call is replaced by its associated 
macro string (from the relevant "LITERALLY" declaration). 
SCAN may be forced to deliver the next single character by
setting the global variable SINGLE. This function is used
whenever one of the symbols ':', '<', '>' or '/' is met,
as these symbols may be parts of compound symbols (:=, >=,
<=, <>, /*comment*/). If SINGLE is not set, a call on SCAN
returns the next PL/M terminal symbol.

The routine starts by scanning input characters until a 
non-blank is met. This is then classified as a quote ('),
a letter, a digit, a special symbol or end-of-file (eof).
Note that the Norwegian characters 'Æ', 'Ø', 'Å' and the
underline character are classified as letters, and also
that upper and lower case letters are not distinguished from
each other. A special symbol or eof is returned directly 
from SCAN, otherwise a buffer is built up containing the 
string, identifier or number. SCAN returns both the basic 
symbol and its type. 

4.2 SCANSYM 


SCANSYM returns a mnemonic value for each basic input symbol 
(S-value). If the next symbol is a number, then this is 
converted to its binary value. If it is an identifier (name), 
it is further classified as a reserved identifier (e.g. "IF"), 
a user-defined identifier (a variable or procedure name, a 
label, etc.) or as a predeclared identifier (e.g. "TIME"). 

If the identifier is not found in the symbol tables, it is 
stacked onto the user-defined identifier table. 

If the character sequence '/*' is found, then input characters 
are scanned until the sequence '*/' is found (a comment), and 
the normal scanning process is resumed. 

If a user-defined identifier has been declared as a macro, 
then the macro mechanism is invoked to replace the identifier 
with its associated macro string, and scanning is resumed 
with input characters now being taken from this string. 
All compiler commands (except EOF) start with the character
'$'. A special routine RDCOM is called to analyse and execute
the command whenever a '$' is returned from SCAN.

4.3 Processing of macros

Macros are declared by a DECLARE LITERALLY statement; any 
valid occurrence of the macro identifier causes a textual 
substitution of the macro name by the associated macro string. 
This is achieved in SCANSYM by a call on the routine MACRO. 
MACRO forces SCNIN to deliver characters one by one from the 
macro string, until the string is exhausted. SCNIN is then 
set to resume scanning the input program. 

When macros are called recursively, pointers are generated 
and stacked to keep account of the current position within 
each macro string. A macro should not call itself.
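
The stacking of positions for nested macro calls can be sketched as below. This is a Python illustration under assumptions: the class name, the method names and the use of `None` for exhausted input (the real SCNIN sets the carry bit) are hypothetical.

```python
class CharSource:
    """Hypothetical stand-in for SCNIN: characters come from the innermost
    active macro string, falling back to the source text."""

    def __init__(self, source):
        self.source = source
        self.pos = 0
        self.stack = []  # [macro_text, position] pairs, innermost last

    def start_macro(self, text):
        """Called when a macro identifier is recognised (the role of MACRO)."""
        self.stack.append([text, 0])

    def next_char(self):
        while self.stack:
            text, i = self.stack[-1]
            if i < len(text):
                self.stack[-1][1] = i + 1
                return text[i]
            self.stack.pop()  # macro string exhausted, resume the outer level
        if self.pos < len(self.source):
            ch = self.source[self.pos]
            self.pos += 1
            return ch
        return None  # input exhausted
```

Because each stack entry records its own position, a macro called from inside another macro string resumes the outer string exactly where it left off.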

4.4 The symbol tables

Three name tables are used for PL/M symbols, of which two 
are static and one is dynamic, the static tables being those 
for reserved identifiers (RESWD) and predeclared identifiers 
(PREDECL). The dynamic table (IDS) is used for stacking user- 
defined identifiers and block markers. These tables contain 
all identifiers visible from the current statement in the 
input program. Whenever a valid 'END' is encountered in the 
source program, all user-declared identifiers belonging to 
the associated block are unstacked, allowing reuse of this 
part of the symbol stack area. 
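
A minimal Python sketch of the stacking and unstacking of user-defined identifiers follows. The class, the marker object and the method names are hypothetical; the real IDS table stores packed entries rather than Python tuples.

```python
_MARK = object()  # hypothetical block-marker sentinel


class IdStack:
    """Illustrative model of the dynamic identifier table (IDS)."""

    def __init__(self):
        self.entries = []  # (name, info) pairs and (_MARK, level) markers
        self.level = 0     # main program = level 0

    def enter_block(self):
        self.level += 1
        self.entries.append((_MARK, self.level))

    def declare(self, name, info):
        self.entries.append((name, info))

    def lookup(self, name):
        """Search from the top so that inner declarations hide outer ones."""
        for entry_name, info in reversed(self.entries):
            if entry_name == name:
                return info
        return None

    def leave_block(self):
        """On a valid END: unstack everything down to the block marker."""
        while self.entries:
            name, _ = self.entries.pop()
            if name is _MARK:
                break
        self.level -= 1
```

Unstacking on END frees the top of the table for the identifiers of the next block, which is what allows the symbol stack area to be reused.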

A set of routines is provided to handle these tables - i.e. 
to look up, insert and remove symbols and block markers. 
User-defined identifier entries have the following format:

<identifier name> <type> <address> <various> <level>

Reserved and predeclared identifier entries have a similar 
format: 

<identifier name> <type> 

<type> indicates the type of identifier (BYTE, LABEL, PROCEDURE 
etc.) and <address> indicates the location of the variable or 
program address. For macros, <address> contains the location 
of the string at compile time (i.e. within the string buffer). 
<various> gives for undefined GOTO's a reference to the chain 
of addresses which must be fixed up when the label address is 
defined, for variables the number of elements (length), and 
for procedures the number of parameters and condition flags. 
<level> gives the level number of the block where the identifier
is defined (main program = level 0).

Block marker entries have the same format as user-defined 
identifier entries, where <identifier name> contains the 
complement of the level of the block, and <various> contains 
the number of pushes generated on the runtime stack (this is 
used when leaving a block by "RETURN" or "GOTO"). <level> is 
used such that it is possible to look up identifiers at 
lower levels. 
5. Operand classification - PRIMARY

This part of the compiler scans and classifies the various 
kinds of operands in expressions. It consists of the 
following routines: 

PRIMARY: recognizes identifiers, constants, string 
operands, and dotted operands. It is called by the 
expression/assignment handler (cf. section 6). 

IDPRIM: classifies an identifier and generates the 
necessary code. It is called by PRIMARY, by the 
expression/assignment routines, and by CALLSTATEMENT. 

routines for processing predeclared identifiers: 
these routines are called by IDPRIM when it has 
found a predeclared function. 

auxiliary routines. 

All these routines generate code as they scan the input text.
If they find erroneous input, the normal recovery action is 
to substitute the predefined operand "MEMORY". 

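An overview is given in figure 5.1 below:

[Fig. 5.1 The Structure of PRIMARY. Diagram not reproduced; the
surviving labels show PRIMARY being reached from EXPRESSION/ASSIGN
and from the call statement routine.]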
5.1 PRIMARY

PRIMARY is called by the expression/assignment handler, 
when this expects another operand. PRIMARY returns values 
in registers D, E, H, L, which describe the operand found, 
and takes the following actions depending on the next symbol 
given by the scanner: 

identifier: IDPRIM is called. 

number (constant): the number is classified as either
a byte or address constant.

string: strings are considered as constants. If the
string is erroneously empty, the byte constant zero
is returned. If the string consists of only one
character, its ASCII value is returned as a byte
constant. Otherwise the first two characters are
taken and their ASCII equivalent is returned as
either a byte or address constant. If the string
contains more than two characters, it is truncated,
and an error message is given.

dot: the next input symbol is scanned. If this is
a string, number or left parenthesis, the routine
LITERAL is called. In the case of an identifier,
IDPRIM is called. Otherwise an error message is
given and MEMORY is returned.

If PRIMARY does not find any operand symbol, then a unary
operator is assumed and the carry flag is set on return,
otherwise the carry flag is reset.

NB! Indexing is done in the assign/expression routines.
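
The treatment of string operands can be sketched in Python as follows. Two points are assumptions, not statements from the report: that the first character is packed into the high byte, and that a two-character value is classified as byte or address by its magnitude. The function name and the error texts are hypothetical.

```python
def classify_string(s):
    """Return (kind, value, error) for a string operand.

    Assumed packing: first character in the high byte.
    Assumed classification: 'byte' if the value fits in 8 bits.
    """
    if len(s) == 0:
        return ("byte", 0, "empty string")
    if len(s) == 1:
        return ("byte", ord(s), None)
    error = "string truncated to two characters" if len(s) > 2 else None
    value = (ord(s[0]) << 8) | ord(s[1])
    kind = "byte" if value < 256 else "address"
    return (kind, value, error)
```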
5.2 IDPRIM

This routine has one input parameter: a reference to the 
symbol table, i.e. the description of the identifier found. 
IDPRIM then decides what to do by means of the information 
found in the symbol table. The following actions may be taken:

undefined variables or unspecified parameters:
after having given an error message, MEMORY is returned.

byte, address or based variables, data, structure, 
dotted label, or MEMORY: the operand description 
according to the symbol table is returned. 

procedure identifiers: routine PCALL is called which 
processes the procedure call, and a description of the 
procedure result is returned. In the case of a notype 
procedure or interrupt, MEMORY is returned and an error 
message is given. 

predeclared identifiers: the appropriate code is 
generated, either directly or via a subroutine, and 
the result description is returned. 

other identifiers: for such illegal identifiers
(e.g. undotted labels) an error message is given, and
MEMORY is returned.

NOTE: IDPRIM processes procedure calls, but does not handle 
indexing. The index scanning and evaluation is done in the 
expression handler. 

For some predeclared identifiers, IDPRIM generates code
directly and calls no subroutine. These identifiers and
actions are:

STACKPTR: A special operand description is returned. 

TIME: an error message is given, and MEMORY is returned. 
The parameter to TIME will be taken as index to MEMORY. 

CARRY: the instruction "SBB A" is generated, and the
operand description classifies the operand to be of type 
byte in register A. 
5.3 Routines for processing predeclared identifiers

5.3.1 LAST/LENGTH/SIZE 

The routine LALEN has 2 input parameters: 

1 for LAST, 0 for LENGTH, -1 for SIZE 

a reference to the symbol table for either LAST,
SIZE or LENGTH, used in a possible error message.

LALEN generates no code. It scans the parameter through the 
procedure REMOTE and returns the value taken from the symbol 
table for the identifier found. Error recovery result is given 
by routine DULAL: 1 for LENGTH, 0 for LAST and 2 for SIZE. 

5.3.2 OUTPUT 

The routine SCOUT scans the parameter number of OUTPUT and 
returns a special operand description. SCOUT has the symbol 
table reference for output as parameter; it is used in a 
possible error message. No code is produced. 

5.3.3 LOW/HIGH 

The routine SLOHI processes calls on LOW or HIGH. It has 
2 parameters: 

true for LOW, false for HIGH 

a reference to the symbol table for either LOW 
or HIGH; used in a possible error message. 

SLOHI generates the appropriate loading into register A, as 
follows: 

LOW (<address expression>): 

<ADREXP> (evaluate address expression to DE) 

MOV A,E 

HIGH (<address expression>): 

<ADREXP> (evaluate address expression to DE) 

MOV A,D 
5.3.4 DOUBLE

The routine DOUBLE generates the appropriate loading of 
the register pair DE, and returns the resulting operand 
description. The parameter is a reference to the entry 
for DOUBLE in the symbol table and is used in a possible 
error message. 

The code generated for DOUBLE (<byte expression>) is:

<BYTEXP> (evaluate byte expression to A)

MVI D,0
MOV E,A

5.3.5 INPUT 

The routine SCINP scans the parameter, generates the 
appropriate IN instruction and returns the A-register 
description. SCINP uses the input parameter, a reference 
to the input entry in the symbol table, in a possible 
error message. The code generated for INPUT (<byte constant>) 
is: 

IN <byte constant> 

5.3.6 DEC 

The routine SCDEC scans through the parameter by calling EXP,
and generates the appropriate decimal adjust instruction(s). 
The input parameter, a reference to the DEC entry in the 
symbol table, is used in a possible error message. 

The code generated for DEC is as follows: 

DEC (<byte expression>): 

<BYTEXP> (evaluate byte expression to A) 


DAA 
DEC (<address expression>):

<ADREXP> (evaluate address expression to DE) 

MOV A,E 
DAA 

MOV E,A 
MOV A,D 
ACI 0 
DAA 

MOV D,A 

5.3.7 PARITY/SIGN/ZERO 

The input parameter to the routine PROFL is the appropriate 
jump instruction: 

JPO for PARITY 
JP for SIGN 
JNZ for ZERO 

PROFL generates the appropriate code below, and returns the 
A-register description. 

PARITY: 

MVI A,0 
JPO 
CMA 

SIGN: 

MVI A,0 
JP 
CMA 

ZERO: 

MVI A,0 
JNZ 
CMA 
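
The three sequences above share one pattern: A is cleared, and the complement is skipped when the conditional jump is taken. A minimal Python sketch of the resulting value, assuming the jump targets the instruction after CMA (the function name is hypothetical):

```python
def flag_to_byte(jump_taken):
    """Model MVI A,0 / J<cond> past CMA / CMA.

    jump_taken is True when the condition of JPO, JP or JNZ holds,
    i.e. parity odd, sign plus, or result not zero respectively.
    """
    a = 0x00             # MVI A,0
    if not jump_taken:   # the jump skips the complement
        a = ~a & 0xFF    # CMA
    return a
```

Under this reading PARITY, SIGN and ZERO yield 0FFH when the parity is even, the sign is minus, or the result is zero, and 0 otherwise.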
5.3.8 ROL/ROR

The routine ROTATE has 2 parameters: 

the instruction code: RLC for ROL,
RRC for ROR

the symbol table reference for ROL or ROR,
which is used in a possible error message.


ROTATE generates the code necessary to perform the cyclic 
shift of the parameter expression, as follows: 

ROL (<bytexp 1>, <bytexp 2>):

<BYTEXP>        (evaluate bytexp 1 to A)
PUSH PSW
<BYTEXP>        (evaluate bytexp 2 to A)
MOV E,A
POP PSW
RLC
DCR E
JNZ             (back to RLC)


ROR (<bytexp 1>, <bytexp 2>):

<BYTEXP>        (evaluate bytexp 1 to A)
PUSH PSW
<BYTEXP>        (evaluate bytexp 2 to A)
MOV E,A
POP PSW
RRC
DCR E
JNZ             (back to RRC)
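
The effect of the rotate loops can be checked with a small Python simulation. It assumes that the JNZ loops back to the rotate instruction; the function names are hypothetical.

```python
def rlc(a):
    """8080 RLC: rotate A left, bit 7 moves to bit 0."""
    return ((a << 1) | (a >> 7)) & 0xFF


def rrc(a):
    """8080 RRC: rotate A right, bit 0 moves to bit 7."""
    return ((a >> 1) | (a << 7)) & 0xFF


def rotate(value, count, step):
    """Simulate: <step>; DCR E; JNZ back to <step>."""
    a = value & 0xFF
    e = count & 0xFF
    while True:
        a = step(a)          # RLC or RRC
        e = (e - 1) & 0xFF   # DCR E
        if e == 0:           # JNZ falls through when E reaches zero
            break
    return a
```

Since the count is tested after the first rotation, a count of 0 runs the loop 256 times; for an 8-bit rotate this leaves the value unchanged.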
5.3.9 SCL/SCR/SHL/SHR

The routine SHIFT has 2 parameters: 

a code: 

1 for SCL 

2 for SCR 

3 for SHL 

4 for SHR. 

a reference to the symbol table entry for either SCL,
SCR, SHL or SHR for use in a possible error message.

SHIFT scans the parameter and calls

BSHIFT for byte rotation

ASCL for SCL of address expression

ASCR for SCR of address expression

ASHL for SHL of address expression

ASHR for SHR of address expression

These routines generate the necessary code given below, except for
some PUSH or POP instructions.

SCL (<bytexp 1>, <bytexp 2>):

<BYTEXP>        (evaluate bytexp 1 to A)
PUSH PSW
<BYTEXP>        (evaluate bytexp 2 to A)
MOV E,A
POP PSW
RAL
DCR E
JNZ             (back to RAL)
SCR (<bytexp 1>, <bytexp 2>):

<BYTEXP>        (evaluate bytexp 1 to A)
PUSH PSW
<BYTEXP>        (evaluate bytexp 2 to A)
MOV E,A
POP PSW
RAR
DCR E
JNZ             (back to RAR)
SCR (<adrexp>, <bytexp>):

<ADREXP>          (evaluate adrexp to DE)
PUSH D
<BYTEXP>          (evaluate bytexp to A)
POP D

First reference to SCR (subroutine ASCR generated in-line):

LXI H,ASCRX
PUSH H
ASCR: MOV L,A
MOV A,E
RAR
MOV A,D
RAR
MOV D,A
MOV A,E
RAR
MOV E,A
DCR L
JNZ
RET
ASCRX:

Subsequent references to SCR (ASCR called as a procedure):

CALL ASCR

SHL (<bytexp 1>, <bytexp 2>):

<BYTEXP>        (evaluate <bytexp 1> to A)
PUSH PSW
<BYTEXP>        (evaluate <bytexp 2> to A)
MOV E,A
POP PSW
ADD A
DCR E
JNZ             (back to ADD A)


SHL (<adrexp>, <bytexp>):

<ADREXP>        (evaluate adrexp to DE)
PUSH D
<BYTEXP>        (evaluate bytexp to A)
POP D
XCHG
DAD H
DCR A
JNZ             (back to DAD H)

NB: Result is in HL.
SHR (<bytexp 1>, <bytexp 2>):

<BYTEXP>        (evaluate bytexp 1 to A)
PUSH PSW
<BYTEXP>        (evaluate bytexp 2 to A)
MOV E,A
POP PSW
ORA A
RAR
DCR E
JNZ             (back to ORA A)

SHR (<adrexp>, <bytexp>):

<ADREXP>          (evaluate adrexp to DE)
PUSH D
<BYTEXP>          (evaluate bytexp to A)
POP D

First reference to SHR (subroutine ASHR generated in-line):

LXI H,ASHRX
PUSH H
ASHR: MOV L,A
ORA A
MOV A,D
RAR
MOV D,A
MOV A,E
RAR
MOV E,A
DCR L
JNZ
RET
ASHRX:

Subsequent references to SHR (ASHR called as a procedure):

CALL ASHR
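
The 16-bit logical shift performed by ASHR can be simulated in Python. The sketch assumes that the JNZ loops back to the ORA A, so that the carry is cleared before each pass; the function names are hypothetical.

```python
def rar(a, carry):
    """8080 RAR: rotate A right through carry. Returns (new_a, new_carry)."""
    return ((a >> 1) | (carry << 7)) & 0xFF, a & 1


def ashr(de, count):
    """Simulate the ASHR body on the register pair DE, count in A."""
    d, e = (de >> 8) & 0xFF, de & 0xFF
    l = count & 0xFF              # MOV L,A
    while True:
        carry = 0                 # ORA A clears the carry flag
        d, carry = rar(d, carry)  # MOV A,D / RAR / MOV D,A
        e, carry = rar(e, carry)  # MOV A,E / RAR / MOV E,A
        l = (l - 1) & 0xFF        # DCR L (leaves the carry untouched)
        if l == 0:                # JNZ falls through when L reaches zero
            break
    return (d << 8) | e
```

The bit shifted out of D travels through the carry into the top of E, which is why the high byte is rotated first.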



5.3.10 INCHAR 


The routine ZINCHAR generates a call to either the MYCROP 
routine TTIO or TTI. If 'INCHAR' is followed by 'NOT' then 
a call to the routine TTI is generated, otherwise TTIO is 
taken. INCHAR is not part of the original PL/M specification, 
but it gives the PL/MYCRO user a useful input tool. ZINCHAR 
has one parameter, the reference to the symbol table entry 
for INCHAR, which is used in a possible error message. 

5.3.11 OUTCHAR 

This routine processes the parameter to OUTCHAR by calling
the routine BYTPAR, which in turn calls BYTEXP. A call to
the MYCROP routine TTO is generated. Like INCHAR, OUTCHAR is
not part of the original PL/M specification. OUTCHAR has one
parameter, the reference to the symbol table entry for
OUTCHAR, which is used in a possible error message.

NB! OUTCHAR is a type procedure.


5.4 Procedure call - PCALL 

This routine is called by IDPRIM and by CALLSTATEMENT.
PCALL scans a call on a declared procedure, either internal 
or external, and generates the necessary code. 

The errors detected by PCALL are: 

illegal recursive call 
incorrect number of parameters 
syntactical error in the call text. 

As recovery PCALL "zeroes" those parameters which are 
missing. 





5.5 TIME 


This routine is called when CALLSTATEMENT finds the
predeclared identifier TIME. The parentheses and the
parameter byte expression are scanned and evaluated.

TIME generates the code sequence for a loop which will
consume the desired amount of time:

MOV A,<byte value>
MVI D,12
MOV E,D
DCR E
JNZ             (back to DCR E)
DCR A
JNZ             (back to MOV E,D)
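
The duration of the TIME loop can be estimated with a short Python simulation. It assumes that the inner JNZ returns to DCR E and the outer JNZ to MOV E,D, and uses the 8080 state counts of 5 for MOV r,r, 5 for DCR r and 10 for JNZ. The function name is hypothetical, and the two set-up instructions are not counted.

```python
STATES = {"MOV": 5, "DCR": 5, "JNZ": 10}  # 8080 states per instruction


def time_loop_states(n, inner=12):
    """Count the states spent in the two nested loops for TIME(n)."""
    a = n & 0xFF
    states = 0
    while True:
        e = inner                        # MOV E,D
        states += STATES["MOV"]
        while True:
            e = (e - 1) & 0xFF           # DCR E
            states += STATES["DCR"] + STATES["JNZ"]
            if e == 0:
                break
        a = (a - 1) & 0xFF               # DCR A
        states += STATES["DCR"] + STATES["JNZ"]
        if a == 0:
            break
    return states
```

Each unit of the parameter costs 200 states. With a 2 MHz clock, which is an assumption here since the report does not state the MYCRO-1 clock rate, that corresponds to 100 microseconds per unit.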

5.6 Utility routines 

TSTUNDEF: tests whether the input type is undefined or
          unspecified parameter, and calls UNDID if this
          is the case.

PSADR:    outputs the address of an identifier to the
          object code file.

PPADR:    computes a program address by means of the current
          location and input displacement, and outputs this
          address to the object code file.

STORP:    scans off input to the appropriate right parenthesis
          (RP), or to some terminating symbol (e.g. semicolon);
          the RP is scanned off, other terminals are not.
          Note that left-right-parenthesis pairs are scanned
          off and will not terminate the procedure. If the
          RP was not found on termination, the carry flag is
          set on return, otherwise carry is reset.
NXSRP:    scans the next input symbol if it is a right parenthesis,
          otherwise it calls MISRP and sets carry on return.

BYTPAR:   scans the input text for:

          left parenthesis (LP)
          byte expression
          right parenthesis.

          If an LP is not found, then MISPAR is called and
          carry is set on return.

NEWPR:    generates either a call on a compiler generated,
          predeclared procedure, or the first entry to it.
          In the latter case the number of bytes necessary
          for the procedure is an input parameter. NEWPR
          checks the global table RTPROC element denoted by
          the type of the predeclared identifier. If it contains
          an address (other than zero), then the call
          is generated. If there is no address, then this is
          the first entry, and the new address is inserted
          into the RTPROC element, the code for the first
          entry is generated, and carry is set on return.

BEXOP:    gives the operand description for a byte expression
          in register A.

AEXOP:    gives the operand description for an address
          expression in register pair DE.

ASCHR:    is a common subroutine to ASCR and ASHR, as these
          shift routines have nearly the same code sequence.
          The common code is generated.

ASHL:     generates the code for SHL (address expression, ...).
ASCL:     generates the code for SCL (address expression, ...).
          The code is, however, only generated once; for later
          occurrences NEWPR will generate a call.

ASCR:     generates the code for SCR (address expression, ...).
          The code is only generated once; for later occurrences
          NEWPR will generate a call. ASCR calls upon ASCHR to
          generate some code.

ASHR:     generates the code for SHR (address expression, ...).
          The code is only generated once; for later occurrences
          NEWPR will generate a call. ASHR calls upon ASCHR to
          generate some code.

REMOTE: scans a variable reference: 

<identifier>[. <identifier>]... [.<identifier>] 
and returns size and length values. 
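
The generate-once behaviour of NEWPR, as used by the address shift routines, can be sketched in Python. The class, its method names and the in-memory code list are hypothetical; the sketch also assumes the first-entry sequence LXI H / PUSH H shown for SCR and SHR in section 5.3.9, with the pushed address lying just past the body.

```python
class RuntimeProcs:
    """Illustrative model of the RTPROC table and NEWPR."""

    def __init__(self, start=0):
        self.rtproc = {}       # routine name -> body address (0 = none yet)
        self.code = []         # (mnemonic, operand) pairs, for inspection
        self.location = start  # current object-code address

    def emit(self, mnemonic, operand, length):
        self.code.append((mnemonic, operand))
        self.location += length

    def newpr(self, name, size):
        """Return True on first entry (the real routine sets carry)."""
        addr = self.rtproc.get(name, 0)
        if addr != 0:
            self.emit("CALL", addr, 3)      # later occurrences: plain call
            return False
        body = self.location + 4            # LXI H is 3 bytes, PUSH H is 1
        self.rtproc[name] = body
        self.emit("LXI H", body + size, 3)  # return address past the body
        self.emit("PUSH H", None, 1)
        return True                         # caller now emits the body

    def emit_body(self, name, size):
        self.emit("BODY", name, size)
```

On first entry the pushed address lets the in-line body end with an ordinary RET, so the same bytes serve both as in-line code and as a callable subroutine.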

5.7 Error routines

MISCOM:   gives the error message "MISSING , ".

MISPAR:   gives the error message "MISSING PARAMETER TO
          PROCEDURE".

ILLPAR:   gives the error message "ILLEGAL PARAMETER TO
          PROCEDURE".

MISRP:    gives the error message "MISSING ) AFTER PARAMETERS".

ERMEM:    called in error recovery to return the operand
          description for ADDRESS.

DULAL:    called in error recovery to return the operand
          description for either constant 1 (LENGTH),
          constant 0 (LAST) or constant 2 (SIZE).





5.8 STRUCTURE 


The procedure STRUCTURE processes the following: 

S1(exp1).S2(exp2)...Sn(expn) or a similar construction
preceded by a dot. S1..Sn-1 are structure identifiers
(based or not). Sn is of type byte or address (based or not),
or of type structure (based or not) if S1 is preceded by a
dot.

Exp1..expn are expressions (indices), they may be omitted.

The processing from left to right proceeds as follows: 

1. The addresses of the Si's are accumulated in ACC. 

As long as they are non-based, no code will be generated.
At the end of the construction, or when a
based Si is met, an instruction to load HL with ACC
will be generated.

2. The expi's are computed (code generated to compute
them) as they are met. Each must be multiplied by
the SIZE of the corresponding structure. If SIZE > 8,
a call on the multiplication routine is generated,
otherwise this is done by adding the expression SIZE
times. If a structure address has been loaded (see 1)
the SIZE*index is added to it, otherwise the index is
pushed. When at last a structure address is loaded, the
index is popped and added to it.
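
The choice between repeated addition and a multiplication call can be sketched in Python. The sketch assumes the structure address is in HL and the index in DE; "CALL MULTIPLY" is a hypothetical name for the multiplication routine, assumed here to leave SIZE*index in DE.

```python
def index_code(size):
    """Instructions that add SIZE*index (index in DE) to the address in HL."""
    if size > 8:
        return ["CALL MULTIPLY", "DAD D"]  # hypothetical routine name
    return ["DAD D"] * size                # add the index SIZE times


def run(code, hl, de, size):
    """Tiny interpreter for the two pseudo-instructions used above."""
    for instruction in code:
        if instruction == "DAD D":
            hl = (hl + de) & 0xFFFF
        elif instruction == "CALL MULTIPLY":
            de = (de * size) & 0xFFFF
    return hl
```

For small structures the repeated DAD D costs one byte per addition, which stays cheaper than a three-byte call plus the set-up of its arguments.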

Input: HL = IDREF for S1

D = 0 if S1 is preceded by a dot
D = 1 if S1 is not preceded by a dot

Output: HL and DE define the resulting operand.

HL = value 
D = mode 

E = location def. or stres (structure result). 


See special table. 






Variables used by procedure STRUCTURE 


S 

( MBREF in the assembly code) 

This is the address in the symbol table of the structure 
currently being processed. The parts of the symbol table 
entry may be denoted S.ADR, S.NEXT, etc. 

DNO 


Holds the value of SIZE for the current structure, i.e. the
number of 'DAD D's to be generated after index processing.

If the index is pushed at runtime, DNO is pushed at compile
time, so that the correct number of 'DAD D's may be generated
when the addition finally takes place.

FOUND 


Boolean variable that indicates whether a new structure- 
member has been found after the last index (or member) 
scanned. Set by SCANMEMBER. 

DOT 


Shows whether the outermost structure was preceded by a 
dot. The value is the input value in the D-register. 

ACC 

(BC-registers in the assembly code) 

Holds the accumulated addresses of the structures processed. 

STATE 

Possible values: 1-5. Used to keep track of the 'state' of 
the structure address in ACC. 
The states are:

1: No index and no based structure has been met. 

ACC holds the absolute address accumulated. No 
code has been generated. This is the initial state. 

2: No index met. A based structure has been met, but 

after this, no structure with relative address ≠ 0.

ACC holds the absolute address of the base of the 
structure. No code has been generated. 

3: Index(es) have been processed, pushed if more than one.

No code to load ACC has been generated. ACC thus holds 
the absolute address accumulated. 

4: Code to load ACC and possibly to add indices, has been 

generated. The resulting address at runtime is in DE. 
ACC holds relative structure addresses that have been 
met after this code generation. 

5: As 4, except that the runtime address is in HL. 
STRUCTURE - output in DE/HL

                      TYPE of last structure (Sn)

STATE  DOT   structure            tbyte/tbbyte        taddress/tbaddress

  1     1    ERROR, then          D = mbid            D = maid
             output as            E = dseg            E = dseg
             for byte             HL = ACC            HL = ACC

  1     0    As for byte          D = mbid            D = maid
                                  E = dseg            E = dseg
                                  HL = ACC            HL = ACC

  2     1    ERROR, then          D = mbbid           D = mbaid
             output as            E = dseg            E = dseg
             for byte             HL = ACC            HL = ACC

  2     0    As for byte          D = mbbid           D = mbaid
                                  E = dseg            E = dseg
                                  HL = ACC            HL = ACC

  4     1    ERROR, then          gen 'XCHG'          gen 'XCHG'
             output as            then as for         then as for
             for byte             STATE = 5           STATE = 5

  4     0    As for byte          D = mdereg          D = mdereg
                                  E = stres           E = stres
                                  HL = 0              HL = 0

  5     1    ERROR, then          D = mbadr           D = maadr
             output as            E = stres           E = stres
             for byte             HL = 0              HL = 0

  5     0    As for byte          D = mhlreg          D = mhlreg
                                  E = stres           E = stres
                                  HL = 0              HL = 0
6. Expression and assignment handling

6.1 Overview

The following routines read and generate code for:
a PL/M assignment - ASSIGNMENT 

a PL/M expression - EXP, BYTEXP, ADREXP, LOGEXP 

EXPSAVE is called from PRIMARY (see Section 5) to save 
expression registers, when PRIMARY has to generate code. 

Assignment and expression processing terminates on encountering
the following symbols: 

1. An operand (number, identifier etc.) following 
another operand. 


2. An operator not belonging to <assignment> or 
<expression> (see PL/M syntax) 

3. Comma, for expressions, and for assignment after = . 

4. A non-matched ) for expressions. 

SCANBACK (see Section 4) will always be called before return, 
so the symbol that caused the return will be read again. 

Very little is done at the five different entry points, before 
control goes to a common processing routine. In the following, 
'exp-processing' will mean this common processing. 

The exp-processing may roughly be divided into:

A: Parsing and syntax checking. (Figure 1) 

B: Code generation. 
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A: Since parentheses are allowed, and since the operators 

have different priorities, operands and operators must 
be stacked to ensure correct order of execution. A compile¬ 
time stack (CTS, see section 6.2.1) is used for both 
operands and operators. The priority between operators is 
defined in the table XCGC. Figure 1 shows the main flow 
of this processing. 'Stack' and 'unstack' refer to CTS. 

The structure of the main tables used is shown in figure 2. | 

I 

The processing proceeds as follows: j 

I 

The EXP-entries call a common routine lEXP that | 

processes the expression down to a single operand. | 

I 

For assignments, 'begin assign' is stacked, and the 
common processing entered. 

Operands are read by PRIMARY. They are always stacked 
immediately. , 

Operators are read by SCANSYM. Via the table XCGA and 
the routine EXPSYM they are converted to the internal 
(for exp-processing) representation of the symbol. 

This is the absolute address of the symbol's entry in 
the table XCGB. Note that when SCANSYM is called, an 
operator is expected. Should the symbol be an identifier, 
constant etc., it is transformed (via XCGA) to the 
internal symbol 'end expression'. This is a common 
internal representation of all symbols that will 
terminate exp-processing. 

An operator is not immediately stacked, as the operands 
are. It is compared to the operator currently on top of 
the CTS. Possibly this operator is unstacked, and code 
generated for it. The comparison is repeated with the 
operator now on top of CTS, until an operator is met 
(on CTS) which causes the incoming operator to be stacked. 

This 'comparison' is done by the routine ACTION, which
simply uses the table XCGC (and XCGCJ) as a 'jump-table',
with the top-stack operator and the incoming operator as
indices.

Processing ends when an 'end expression' is met. This symbol
unstacks all operators on CTS down to 'begin expression' or
'begin assign' and then goes to routines which terminate
the processing.
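
The stacking scheme described under A can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the
PRIORITY values, the function names and the three-address output are
hypothetical and stand in for the tables XCGC/XCGCJ and the real code
generator; parentheses, unary operators and assignments are left out.

```python
# Hypothetical priorities; the report keeps these decisions in table XCGC.
PRIORITY = {"*": 2, "/": 2, "+": 1, "-": 1}


def action(top, incoming):
    """Decide the action for a (top-stack operator, incoming operator) pair."""
    if top == "begin":
        return "end" if incoming == "end" else "stack"
    if incoming == "end" or PRIORITY[top] >= PRIORITY[incoming]:
        return "unstack"
    return "stack"


def compile_expression(tokens):
    """Parse operand/operator tokens; return (result operand, generated code)."""
    cts = ["begin"]                    # one stack for operands and operators
    code = []
    pos = 0
    while True:
        cts.append(tokens[pos])        # operands are stacked immediately
        pos += 1
        incoming = tokens[pos] if pos < len(tokens) else "end"
        if incoming not in PRIORITY:   # any non-operator ends the expression
            incoming = "end"
        pos += 1
        # Compare the incoming operator with the operator on top of the
        # stack (cts[-2], since an operand is on top) until it can be stacked.
        while (act := action(cts[-2], incoming)) == "unstack":
            right, op, left = cts.pop(), cts.pop(), cts.pop()
            result = f"t{len(code) + 1}"
            code.append(f"{result} = {left} {op} {right}")
            cts.append(result)         # the result becomes a new operand
        if act == "end":
            return cts[-1], code
        cts.append(incoming)
```

For example, `compile_expression(["a", "+", "b", "*", "c"])` unstacks
`*` before `+`, and a single operand produces no code at all, as noted
for lEXP.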

B: The compiler may generate normal (non-reentrant) or
reentrant code. This is decided by a flag set by the
user in the $R-command. This flag is tested in 4 places
in the expression/assignment handling:

1.   In XPLOPN/XPROPN, where the operand mode for
     identifiers is changed if reentrant code is to
     be generated.

2-3. In the routines SAVE (via SAVR/PSHR) and RELTYP,
     which handle the runtime saving of registers.

4.   In OUTBC in STRUCTURE, where code to load an
     operand into HL is generated.

The reentrant code uses BC as base register.

The code generator has 3 entry-points, GENOPT, GENASS
and GENLDR, for operators, assignments and final
load instructions respectively.

The code to be generated is partly taken from a large
table, the CODE BYTE TABLE, and partly generated ad hoc.
Access to the CODE BYTE TABLE is via the ACCESS TABLES
TO CODE BYTES.

Indices in these tables are the operand modes.

The 3 routines use different access tables.

GENLDR   generates a 'load right operand'

GENASS   generates a 'load right operand'
         and a 'store in left operand'

GENOPT   generates a 'load left operand'
         and a 'load right operand'
         and code for the operator.



[Flowchart. Recoverable labels, in order: entries BYTEX,
ADREX, LOGEX, ASSIGN and STC2; lEXP with INSYM = 'begin expr'
or 'begin assign'; stack INSYM; call PRIMARY, with exits
'operand' (PRIMARY-result) and 'operator' (call SCANBACK;
this call is actually in PRIMARY); XPOPT: call SCANSYM,
INSYM = new symbol (routines: SCANSYM, EXPSYM; tables:
XCGA, XCGB).]

Fig. 2: Main flow in expression processing.














[Figure: Main tables in expression processing. One box is
labelled 'routines to generate operator code'.]











The table XCGC

(" = same entry as in the row above)

TOP-STACK  | INCOMING OPERATOR
OPERATOR   | */MOD  +      -      =      <>..   NOT    AND    OR/XOR :=     (      ,      )      AEEX
-----------+------------------------------------------------------------------------------------------
* / MOD    | TRMOP  TRMOP  TRMOP  TRMOP  TRMOP  XR4    TRMOP  TRMOP  XR4    STKLP  TRMOP  TRMOP  TRMOP
+          | STACK  AROP   AROP   AROP   AROP   "      AROP   AROP   XR4    "      AROP   AROP   INCOP
-          | "      MINOP  MINOP  MINOP  MINOP  "      MINOP  MINOP  "      "      MINOP  MINOP  DECOP
=          | "      STACK  STKMI  XR4    XR4    "      EQOP   EQOP   "      "      EQOP   EQOP   EQOP
<> <= ...  | "      "      "      "      "      "      RELOP  RELOP  "      "      RELOP  RELOP  RELOP
NOT        | "      "      "      STACK  STACK  "      UNOP   UNOP   "      "      UNOP   UNOP   UNOP
AND        | "      "      "      "      "      STKUN  BINOP  BINOP  "      "      BINOP  BINOP  BINOP
OR XOR     | "      "      "      "      "      "      STACK  "      "      "      "      "      "
:=         | "      "      "      "      "      "      "      STACK  "      "      IASOP  IASOP  IASOP
(          | "      "      "      "      "      "      "      "      STKA   "      XR6    UNPAR  XR6
,          | XR12   XR12   XR12   STKAS  XR12   XR12   XR12   XR12   XR12   STKIN  STKA   XR5    XR12
ABAS       | "      "      "      "      "      "      "      "      "      "      "      "      EASS
ABEX       | STACK  STACK  STKMI  STACK  STACK  STKUN  STACK  STACK  STKA   STKLP  EEXP   EEXP   EEXP
NEG        | UNOP   UNOP   UNOP   UNOP   UNOP   XR4    UNOP   UNOP   XR4    "      UNOP   UNOP   UNOP
=(assign)  | STACK  STACK  STKMI  STACK  STACK  STKUN  STACK  STACK  STKA   "      XKOMMA XR5    ASSOP
index      | "      "      "      "      "      "      "      "      "      "      XR7    ADRES  XR7

For reading convenience, in this figure the routine-names are filled into the entries. In the compiler, the
entries are bytes and access is via XCGCJ.

FIG. 4

6.2 Data structures used in exp-processing

6.2.1 Compile-time stack for operators and operands (CTS)

[Figure: layout of the CTS. SP points to the top entry.
Operator-entries and operand-entries alternate on the
stack. An operator-entry holds the XCGB-address (low byte,
high byte). An operand-entry holds LOG, MODE and the
operand-value (low byte, high byte).]

FIGURE 5


As shown in the figure, operands and operators are stacked 
on the same stack (with PUSH/POP-operations). 

A variable, TOPST, is used to mark if an operator (TOPST=0)
or an operand (TOPST=1) is currently on the top.

A variable, TOPT, is used to hold the operator highest on
the stack. TOPT must thus be updated when stacking/unstacking
occurs.

An operator-entry consists of the 2-byte address (absolute)
of the operator in the table XCGB.

The same representation of operators is used in the variables
TOPT and INSYM (= last read operator).

An operand entry consists of 4 bytes:

Value: (2 bytes) is the address of an identifier, the
       value of a constant, or the address of the base
       for a based identifier.

LOG:   is an identification of the program segment where
       the operand is located. Possible values are MSEC,
       PSEG, DSEG or 0 for constants.

MODE:  describes the type of the operand.

6.2.2 Operand modes

The mode of an operand, when it is stacked on CTS, consists
of one byte. It has a value 1-13 in the 4 lower bits, and
possibly some flags set in the higher 4 bits.

The possible values in the lower bits are:

IREG:   1   Result in DE-registers
BREG:   2   Result in A-register
INHL:   3   Result in HL-registers
BADR:   4   Address of byte in HL
IADR:   5   Address of address in HL
BBAS:   6   Based byte identifier
IBAS:   7   Based address identifier
BDIR:   8   Non-based byte identifier
IDIR:   9   Non-based address identifier
BCON:  10   Byte constant
ICON:  11   Address constant, or dotted constant (list)
ISTP:  12   STACKPTR
BOUT:  13   OUTPUT

The flag-bits of the mode have the meaning:

Bit 4: set = dotted identifier (based or not)
Bit 5: set = operand allowed left of assign
Bit 6: set = index allowed after operand
Bit 7: set = special processing for constant indices.
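
As a small illustration of this packing, the mode byte can be taken
apart as sketched below. The sketch assumes that bit 0 is the least
significant bit, uses BCON as the name of mode 10 (byte constant), and
the function names and flag labels are hypothetical, not taken from
the compiler.

```python
MODE_NAMES = {
    1: "IREG", 2: "BREG", 3: "INHL", 4: "BADR", 5: "IADR",
    6: "BBAS", 7: "IBAS", 8: "BDIR", 9: "IDIR", 10: "BCON",
    11: "ICON", 12: "ISTP", 13: "BOUT",
}

# Flag bits in the upper half of the mode byte (bit 0 = least significant).
FLAGS = {
    4: "dotted",
    5: "left-of-assign",
    6: "index-allowed",
    7: "constant-index",
}


def decode_mode(mode_byte):
    """Split a mode byte into its mode name and the list of flags set."""
    name = MODE_NAMES[mode_byte & 0x0F]
    flags = [label for bit, label in FLAGS.items() if mode_byte & (1 << bit)]
    return name, flags


def clear_flags(mode_byte):
    """Keep only the mode value in the lower 4 bits."""
    return mode_byte & 0x0F
```

A non-based byte identifier that may be assigned to and indexed would,
under these assumptions, be stacked as `0x68`.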


Looking at fig. 1, we see that operands come from two sources, 
from a PRIMARY-call, and as a result of codegeneration. 

The non-result operands, 6-13, come from PRIMARY, the
others from PRIMARY or code-generation.

When an operand is popped off the CTS for code-generation,
the mode may be changed before it is used as index in the
code-byte tables. This change is done by the routines
XPLOPN/XPROPN and RELTYP/ARELTYP.

The resulting operands have no flag-bits, and the values
may be: (in addition to 1-13 above)

BREL:  12   Byte value on RTS
IREL:  13   Address value on RTS
CBAS:  14   As BBAS for reentrant
JBAS:  15   "  IBAS  "
CDIR:  16   "  BDIR  "
JDIR:  17   "  IDIR  "
JCON:  19   Dotted id for reentrant
BMRL:  20   Address of byte on RTS
IMRL:  21   Address of address on RTS
BINB:  22   Byte value in B-register
IINB:  23   Address value in BC-register

(RTS = Run time stack)

6.2.3 Tables used in exp-processing
(see figure 3 and 4)

XCGA

Index: SCANSYM value of a symbol

Entry: 1 byte, giving the symbol's relative XCGB address

Use:   by routine EXPSYM, which computes the symbol's
       absolute XCGB address. This is the internal
       representation of the symbol during exp-processing.

XCGB

Index: See XCGA

Entry: Up to 6 bytes.

       1. byte: Symbol's index in XCGC, both as left and
                right index.
       2. byte: Flag to mark generation of an increment
                instruction.
       3. byte: Type of code generation table.
       4. byte: Operator for low byte
       5. byte: Operator for high byte
       6. byte: Result type of instruction.

Use:   The entry contains all information about an operator.
       1. byte is used to access XCGC,
       3. byte to access code tables and
       2., 4., 5., 6. bytes are used by the code generation
       routine.

XCGC

Index: See XCGB

Entry: 1 byte, which again is index in XCGCJ (JUMP-table
       for XCGC)

Use:   The table is used to decide the action to be
       performed for all combinations:
       top stack operator / incoming operator.
       (TOPT/INSYM)

       The actions are mainly of two types, stacking and
       unstacking.

XCGCJ

Index: See XCGC

Entry: 1 word = address of action routine (XCGC)

Use:   It allows the XCGC entries to be 1 byte instead
       of 1 word, and this saves space since many of them
       point to the same routine.

THE CODE BYTE TABLE and ACCESS TABLES TO CODE BYTES:
see 'generation of code for assignment and expression'.
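
The two-level lookup through XCGC and XCGCJ can be sketched as
follows. The table contents, the three action functions and the
routine count used in the size estimate are hypothetical and only
illustrate why one-byte entries plus a separate jump table take less
space than a matrix of two-byte addresses.

```python
def stack():
    return "stack"


def unstack():
    return "unstack"


def error():
    return "error"


# Jump table: one word (address) per distinct action routine.
XCGCJ = [stack, unstack, error]

# Action matrix: one byte per (top-stack operator, incoming operator)
# pair, holding an index into XCGCJ. The contents are made up.
XCGC = [
    [1, 1, 2],
    [0, 1, 2],
    [0, 0, 2],
]


def action(topt, insym):
    """Return the action routine for the given pair of operator indices."""
    return XCGCJ[XCGC[topt][insym]]


def table_bytes(rows, cols, routines, word=2):
    """Size of a direct address matrix versus byte matrix plus jump table."""
    direct = rows * cols * word
    indirect = rows * cols + routines * word
    return direct, indirect
```

With, say, a 16 by 13 matrix and 30 distinct routines,
`table_bytes(16, 13, 30)` compares the two layouts.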

6.2.4 Access routines to tables 

For each table there is usually a routine to access it. 

This is done because the table access often requires a
long sequence of instructions and the logic is confused
if these are inserted among other instructions.

The routines are:

EXPSYM: to access XCGA - XCGB

ACTION: to access XCGC - XCGCJ

6.2.5 Variables used in exp-processing

ADMUL:  Holds address of multiplication routine, after it
        has been inserted in the code. Before this, ADMUL = 0.

ADDIV:  Similar for division routine.

LASTR:  See 'Runtime-stack' and 'Result codes'
TOP:    "
BCOCC:  "

OPN12:  Used in ASSOP to mark that STACKPTR is allowed as
        operand. Tested in XPOPN.

INSYM:  Holds the absolute address of the XCGB-entry for the
        last read operator, until it is stacked on CTS.

TOPT:   Content as INSYM, but for the operator currently on
        top of CTS. Must thus be updated when stacking/
        unstacking occurs.

OPNL:   3 bytes. Holds the left operand when code generation
        starts.
        1. byte = LOG, 2. and 3. byte = value.
        See compile time stack, operand entry.
        MODE is in B-reg.

OPNR:   As OPNL, but for right operand.
        MODE is in C-reg.

TOPST:  =0 when operator on top of CTS
        =1 when operand on top of CTS

RESMOD: Set before code generation to mark the type of the
        result (0 = byte, 1 = address).

IDFLAG: 1 byte
OPTYP:  1 byte
OPTOR:  2 bytes
RESULT: 1 byte

        These correspond to bytes 2-6 in the XCGB-entry
        for an operator. They are loaded from the entry
        before the code generation for that operator,
        and used by the code generating routines.

        In addition, OPTOR holds the address of the
        XCGB-entry, between the unstacking of the operator
        and the loading of the entry.

CURHL:  5 bytes

        Used in the HL-optimization system.

6.3 Expression parsing and syntax checking

6.3.1 Main routines: EXP, BYTEXP, ADREXP, LOGEXP, ASSIGNMENT

EXP, BYTEXP, ADREXP and LOGEXP call the routine lEXP that
processes the expression until it is reduced to a single
operand. Note that if the expression consists of one operand,
e.g. a constant, no code has been generated when lEXP
returns. The different entries then generate the final
instructions to load the operand into

   A:  EXP, BYTEXP
or DE: EXP, ADREXP

For LOGEXP a 'jump false' is the last byte to be generated.

If the final result code (see 'Result codes') was 8 or bigger,
one of JC, JNC, JZ, JNZ is generated.

Otherwise, code is generated as for BYTEXP and then an 'RRC'
followed by a 'JNC' is generated.

lEXP pushes a simulated 'left operand', then pushes
'begin expression' and enters the common processing.

In ASSIGNMENT, PSW, BC, DE are saved. 'Begin assign' is pushed.

The first call for an operand is not to PRIMARY, but to IDPRIM,
since on entry both an identifier and the following symbol
have been read, and a reference to the identifier is in HL. It
is passed on to IDPRIM, and then the common exp-processing is
entered.

6.3.2 ACTION routines - stack

STKAS: Stacks a '=' as an assign operator, and changes LASTR
       if last operand was an array element. See 'Result codes'.

STKA:  Stacks ',' and else as STKAS.

STKMI: Stacks a unary or binary minus depending on the operand
       situation.

STKUN: Stacks a unary operator. A dummy byte constant is
       first stacked, to make later processing similar to
       the binary operators.

STKLP: Stacks 'left parenthesis' or 'index left parenthesis',
       depending on the operand situation.

STKIN: Stacks 'index left parenthesis'.

STACK: Stacks ordinary binary operators.

6.3.3 ACTION routines - unstack

BINOP: Pops an operator and two operands off the CTS.
       Calls the code generation routine GENOPT, and
       pushes the resulting operand on to CTS.
       Jumps to GTOPT, where the same incoming operator
       and the operator now on top of CTS (after the
       unstacking) decide which is the next action.

ADRES: Same as BINOP, but for the 'index operator'. Checks
       the mode of the left operand to see if index is
       allowed. If the right operand, the index, is a constant
       and the left operand allows it, the two operands are
       transformed to one, without code generation
       (B(3) = .B + 3). Else GENOPT is called as in BINOP,
       and the result pushed. Jumps to STCOPN to read the
       next symbol (in contrast to BINOP), since ADRES is
       invoked by a right parenthesis.

ASSOP: As BINOP, for the assign operator (not :=).
       In addition:
       The next operator on CTS, if comma, is changed to
       assign. If not (it must then be 'begin assign'),
       a flag, OPN12, is set to allow STACKPTR and OUTPUT
       as operands. The left operand's mode is tested to
       check that it is allowed left of assign. Calls the
       code generating routine GENASS, before jumping to
       PUSHOPN in BINOP to push the resulting operand.

IASOP: As ASSOP, but for :=, where STACKPTR and OUTPUT are
       not allowed.

UNPAR: Invoked at the combination leftpar/rightpar.
       Pops an operand off CTS, pops the leftpar off CTS,
       and then again pushes the operand, after the mode
       has been changed so it is not allowed left of assign
       or to be indexed. No code is generated.
       Jumps to STCOPN, as ADRES.

6.3.4 ACTION routines - end of processing

EEXP:  Arrival here means end of the lEXP processing.
       The two operands are popped; the left is the dummy
       from lEXP, the right is the result-operand for the
       expression. This is modified by XPROPN, then the
       routine returns to after the lEXP call in one of
       the entries EXP, BYTEXP, ADREXP or LOGEXP.

EASS:  Terminates processing entered through ASSIGN.
       Operand and 'begin assign' are popped off CTS,
       and SCANBACK called as in EEXP. Registers are
       unsaved. No code is generated.

6.3.5 Utility routines for ACTION routines

XINIT: Clears LASTR, TOP, BCOCC, OPN12.

POP2:  Pops 2 operands and 1 operator off CTS and places
       their values in the appropriate variables. Operand
       modes go to BC. Updates TOPT.

XPLOPN:

XPROPN: Converts the modes of left/right operand (see
        'operand modes') as follows:

        The bits of the mode that mark index allowed,
        left-of-assign allowed etc. are cleared, and dotted
        identifiers are converted:
        dotted non-based id. = address constant
        dotted based id. = address id.

        Finally, if the operand is an identifier in DSEG,
        and reentrant code is to be generated, the mode is
        changed to one of the 'reentrant-identifier' modes.

ADDCHK: Called from BINOP before code is to be generated
        for an operation. If the operator is + or -, and
        the right operand is byte constant = 1, the operator
        is changed to INC or DEC, which have their own entries
        in XCGB.

        This is done to simplify the code in these cases.

        If the operator is +, and one of the operands is
        of type address, the operator is changed to DAD,
        which also has an entry in XCGB. The code generated
        will then use the hardware DAD instruction.
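
The operator rewriting done by ADDCHK can be illustrated by the sketch
below. The Operand class and the function name are hypothetical, and
the sketch assumes that the INC/DEC test is applied before the DAD
test; the text does not say which one wins when both apply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Operand:
    is_address: bool                 # True for address type, False for byte
    constant: Optional[int] = None   # value if the operand is a constant


def addchk(operator, left, right):
    """Return the operator to generate code for, possibly specialised."""
    right_is_byte_one = (not right.is_address) and right.constant == 1
    if operator in ("+", "-") and right_is_byte_one:
        return "INC" if operator == "+" else "DEC"
    if operator == "+" and (left.is_address or right.is_address):
        return "DAD"                 # use the double add instruction
    return operator
```

All other operator and operand combinations are returned unchanged.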

OPNR1:  Tests if right operand is a byte constant = 1.
        Zero bit as result.

COMP2:  Compares the contents of DE and HL.
        Zero bit set = equality.
        DE destroyed.

6.3.6 Error routines 

The error routines, and only these, have labels of the form
XR number. Some of them are ACTION routines, accessed via
the table XCGC. Others are called from the other ACTION
routines.

The ERROR message in each shows in which case the routine
is called.

1. Insertion of missing operands

2. Changing of wrong operands

3. Removal of superfluous operands

4. Removal of wrong operators (with an operand)

The purpose has been to allow processing beyond an error by
maintaining a correct structure of the CTS. Nothing is
guaranteed for the generated code: since operands may be
removed, the RTS may be incorrect.

There is an ABORT-call in LD2OPN in the code generation
part (see LD2OPN).

6.4 Run-time use of registers

Since all byte arithmetic uses the A-register, it is natural
to choose this register to hold byte results during processing
of an expression.

The double add instruction DAD gives the result in HL. For
all other operators, however, that must be performed in two
steps for double byte, it is sometimes inconvenient to place
the result in HL (if one of the operands is given by two
M references). It was therefore decided to place such results
in DE. The operand mode of the result then shows in which
register the result is placed.

During execution of an expression, e.g. (I+J)+(K+L), with all
identifiers of byte type, the result of I+J in A-reg must be
saved somewhere while K+L is computed.

The philosophy behind the saving of intermediate results on
the run-time stack can be described as follows:

There is a choice between two main strategies:

1. To save a register only when it becomes necessary,
   because the same register is required again.

2. To let the saving 'follow' the compile time stack,
   i.e. save the register every time a result is pushed
   on to the CTS.

At first sight, method 1 would seem to generate fewer
unnecessary save instructions. However, the saving will make
the access of results more difficult, since a result is not
necessarily on top of the runtime stack when needed, as it
will be with method 2.

Method 2 was therefore chosen, in principle, with an 'extended'
run-time stack. The extension consists of the BC registers, and
two 'virtual' cells, to minimize the save instructions actually
generated.


6.4.1 Run-time stack 

The hardware-defined stack with stack pointer SP, and the
operations PUSH and POP, is used as run-time stack. For
optimization reasons, it is extended, as the figure shows:

[Figure: the extended run-time stack, showing SP, the
hardware STACK, BCOCC, TOP and LASTR.]

LASTR, TOP and BCOCC are byte variables. The two first may hold
a register code, the last the values 0/1 (false/true).

This means that the stack is extended: first with the registers
BC, and then by two 'virtual' cells. Stacking a result on top
of stack is simply done by setting LASTR = result code (see
'Result codes') for the result. No code is generated.

Stacking of another result may then be done by:

   TOP = LASTR
   LASTR = new result

and still no code generated.

If the two results were in the same register, this is of
course not possible, since no real saving is involved. We
would then have

   Generate 'MOV BC, old-result-register'
   LASTR = new-result-register.

In short, TOP and LASTR may never hold the same result code.

BCOCC tells if there is a result 'stacked' in BC.

For this mechanism to operate as a stack it is crucial that
the sequence of the results is not changed.

Example: if LASTR, TOP and BC all are occupied, and a new
result is to be saved, we must go through:

   Generate 'PUSH B'
   Generate 'MOV BC, old TOP-result'
   TOP = LASTR
   LASTR = new result.

It is not necessary that BC, TOP and LASTR are all filled.
E.g., if we 'pop' the result in BC, that is, use it in code
generation, we do not then automatically generate a 'POP B'
to get the next result up in BC. This might generate
unnecessary code.

Not all operands are saved via the BC-registers, as
described above.

If the code-generation is 'reentrant' this cannot be done,
since BC then is used as base-register.

For non-reentrant code generation, the result codes IREG,
INHL (value in DE or HL) lead to immediate PUSH. For the
other result-codes, the value is saved in BC.

See routines SAVR/PSHR.
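
The bookkeeping of the extended run-time stack can be modelled with
the sketch below. It is a simplification under stated assumptions:
the class and method names are hypothetical, results are identified
only by a register name, every save goes through BC (the immediate
PUSH for IREG/INHL and the reentrant case are ignored), and the
emitted strings are the pseudo-instructions used in the text, not
real 8080 mnemonics.

```python
class ExtendedStack:
    """Model of LASTR, TOP and BCOCC with a list of generated 'code'."""

    def __init__(self):
        self.lastr = None     # register of the most recent result
        self.top = None       # register of the result below it
        self.bcocc = False    # True if a result is held in BC
        self.code = []

    def _clear_bc(self):
        # Free BC by pushing it on the hardware stack, if occupied.
        if self.bcocc:
            self.code.append("PUSH B")
            self.bcocc = False

    def _move_to_bc(self, reg):
        self.code.append(f"MOV BC,{reg}")
        self.bcocc = True

    def push_result(self, reg):
        """Record a new result in register reg, saving older ones."""
        if self.lastr is not None:
            if reg != self.lastr:
                # LASTR survives: it only moves to the virtual cell TOP.
                if self.top is not None:
                    self._clear_bc()
                    self._move_to_bc(self.top)
                self.top = self.lastr
            else:
                # Same register needed again: a real save is required.
                self._clear_bc()
                if self.top is not None:
                    self.code.append(f"PUSH {self.top}")
                    self.top = None
                self._move_to_bc(self.lastr)
        self.lastr = reg
```

In this model the first two results in different registers cost no
code at all, and the order of the saved results is never changed.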

6.4.2 Use of the run-time stack 

We will now describe the generation of code for an operation

   Operand 3 = operand 1 operator operand 2

and the saving/unsaving necessary.

At compile-time operand 1, operand 2 and operator are popped
off the compile-time stack, and after the code is generated,
operand 3 is pushed on it.

A safe (but extra code generating) method is to copy this at
run-time, and generate

   POP operand 1
   POP operand 2
   code for the operation
   PUSH operand 3

Operand 1 and operand 2 may have modes 1-11.

If the mode > 5, the operand is not the result of a previous
computation, and the POP-instruction will not be needed.

In the following we will call such operands NR-operands
(Non Register), and operands with modes <= 5 R-operands
(Register).

We shall now study how the above code sequence reduces when
we use the extended run-time stack described in the previous
chapter.

First, we note that the 'PUSH operand 3' always is: 

LASTR = operand 3 and thus no code is generated for it. 

The code-sequences may be divided in 3 types: 


1. Operand 1 and operand 2 both NR. 

No POP operations involved. 

The operation generates a new result to be 'saved' in 
LASTR. The old LASTR and TOP, if any, must then be 
saved. 

The code byte table contains, for each code sequence 
to be output, information as to which registers the 
sequence will destroy. (The second of the 3 bytes 
preceding the sequence). 


   With the aid of this, and LASTR, TOP and BCOCC, the
   routine SAVE may generate the necessary instructions.

   If there is a LASTR =/= 0 they may vary from the simplest

         TOP = LASTR (no code generated)

   to    'PUSH B'
         'PUSH TOP'
         'MOV BC, LASTR'

2. One (but not both) of operand 1 and operand 2 is NR.

   The R-operand will be in LASTR, so no POP-instruction
   is needed.

   However, if TOP is occupied, the code sequence may destroy
   that register.

   SAVE is called, as for 1.

3. Both operand 1 and operand 2 are R-operands.

   Operand 2 is in LASTR, so no 'POP operand 2' is needed.

   If TOP =/= 0, TOP = operand 1, so no save instructions
   are needed.

   Operand 1 may be in 3 places:

   a. In TOP, no POP is required, and code is generated
      directly for the registers given by TOP and LASTR.
      The 'POP' consists in setting TOP = 0.

   b. In BC, no POP is required, and code is generated
      directly for BC and the LASTR register.
      The 'POP' consists in setting BCOCC = false.

   c. On the stack, a 'POP' instruction is then generated.

To accomplish the different code generation needed for a, b
and c above, the routines RELTYP (for arithmetic operators)
and ARELTYP (for left-of-assign) change the mode of the
left operand, when both operands are in registers.

6.4.3 Result codes


The result codes, possible values of LASTR and TOP, show
where the result of a computation is located. They are:

IREG  =  1: result in DE
BREG  =  2: result in A
INHL  =  3: result in HL
BADR  =  4: byte-result address in HL
IADR  =  5: address-result address in HL
RESZ  =  8: Logical result in zero bit and A-reg:
            true <=> Z = 1 & A = 0
RESNZ =  9: Logical result in zero bit and A-reg:
            true <=> Z = 0 & A =/= 0
RESC  = 10: Logical result in carry bit:
            true <=> C = 1
RESNC = 11: Logical result in carry bit:
            true <=> C = 0

We see that the first five correspond to the first five operand
modes, with the following slight modification:

   When an assign operator is stacked, the operand
   on top is checked. If it has mode = BADR, LASTR
   is changed to INHL. This is done because of a
   difference in the saving of such operands: left
   of assign their address must be saved, elsewhere
   it is more convenient to save the value. Note
   that the operand mode on the stack is not changed.

The others are used as follows:

8, 9, 10, 11: The relation operators =, <>, <, >, <=, >= shall
   according to the PL/M definition produce a byte-result
   consisting of either all zeroes or all ones.
   If the last operation performed in an expression is
   one of these, e.g. if K<J+1 then ..., very often the
   resulting byte-value will not be assigned to anything,
   but just tested to generate a conditional jump. (This
   is the case in the example above.)

As an example, let us study the code for if X = Y then ...,
and for Z = (X=Y); X, Y, Z are byte identifiers.

The last would be:

        LDA   X
        LXI   H,Y
        XRA   M
        CPI   1
        SBB   A
        STA   Z

The instructions CPI 1, SBB A are used to 'transform' the Z
to a byte value. In the if-statement the code following the
XRA instruction could be:

        CPI   1
        SBB   A
        RRC
        JNC   false

but better just:

        JNZ   false

For this reason, the code sequences for the relational operators
stop with the XRA-instruction (SUB for <, >...) and LASTR is set
to 8, 9, 10 or 11 to mark which condition represents true. If the
logical byte in A-reg is needed, the instructions to compute it
(CPI 1, SBB A for =) are generated by the routine FIXRL.
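
That CPI 1 followed by SBB A turns the zero flag into an all-ones or
all-zeroes byte can be checked with a small simulation of the
instructions involved. The helper names below are hypothetical; only
the A-register and the two flags that matter here are modelled.

```python
def xra(a, m):
    """XRA M: A = A xor M; returns (A, zero flag)."""
    result = (a ^ m) & 0xFF
    return result, result == 0


def cpi(a, immediate):
    """CPI n: compare A with n; returns the carry flag (set if A < n)."""
    return a < immediate


def sbb_a(carry):
    """SBB A: A = A - A - carry, i.e. 0xFF if carry was set, else 0."""
    return (0 - carry) & 0xFF


def equal_as_byte(x, y):
    """Code for Z = (X=Y): XRA M, CPI 1, SBB A."""
    a, _zero = xra(x, y)
    carry = cpi(a, 1)
    return sbb_a(int(carry))


def equal_as_jump(x, y):
    """Code for 'if X = Y': XRA M, then JNZ false; True = no jump taken."""
    _a, zero = xra(x, y)
    return zero
```

Both forms agree on the outcome; the second simply avoids computing
the byte value when only the jump is needed.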


If the expression was entered through LOGEXP, and the last
operation to be performed was a relation, LASTR will have one of
the values 8, 9, 10, 11 when EEXP is reached. The first byte of a
conditional jump instruction is then generated, since
the calling routine (if-statement, while-statement ...)
always needs a 'jump false' following the evaluation of
a logical expression.

Note that the operand on the CTS corresponding to a
LASTR = 8 has the value 2 (A-reg). This works because,
if the operand is used in further computations, FIXRL will
be called by the code generation, so that the result is
really in A when used.

6.5 Generation of code for expressions and assignments

1. The entries GENOPT, GENASS and GENLDR with utility
   routines.

2. The CODE BYTE TABLE.

3. ACCESS TABLES TO CODE BYTES.

4. The HL-optimization system.

Since execution of the routines mainly consists of
processing the tables, these are described first.

6.5.1 The CODE BYTE TABLE

This table consists of:

1. A collection of codebyte sequences to be output.

2. 5 special codebyte entries.

3. The routine CPRO, which processes the special entries.

1. The code sequences contain either code for one load
   operation or for one store operation. The operator
   code is generated by the utility routines CONTN and
   CONTD.

   The structure of one of the load sequences is as follows:

      DB S,W,D
      <instructions>
      JMP <adr> or CALL <adr>

   The <instructions> are the bytes to be actually output.
   They may contain the byte OPERAND. If so, the operand
   in OPNL (left) or OPNR (right) is output instead of
   OPERAND. Otherwise they are output as they are.

   The bytes S, W and D contain information about which
   registers the <instructions> use. Each of them
   contains a 'register code' which is a combination of
   the five possible A, D, E, H or L.

      HL     means H + L
      DE     means D + E
      ALLREG means HL + DE + A

   S (source) shows the register(s) the operand occupies
   before <instructions> are executed.

   W (way) shows the register(s) that <instructions>
   will destroy.

   D (destination) shows the register(s) occupied after
   the execution of <instructions>.

   In addition D may contain a flag, R2FIRST, which is
   tested in the utility routines LD2OPN/TRY.

   Examples:

   1: The operand is a based address to be loaded into DE

      Code:   LHLD  OPERAND
              MOV   E,M
              INX   H
              MOV   D,M

      S,W,D = 0, HL+DE, DE

   2: The operand is the address of a byte (e.g.
      resulting from an array element) to be loaded into A.

      Code:   MOV   A,M

      S,W,D = HL, A, A

   The 'CALL' or 'JMP' instruction terminates <instructions>.

   'CALL' means that the processing is to continue with the
   <instructions> of the sequence referenced by <adr>.

   'JMP' means a transfer of control to <adr>.

   <adr> is either a special codebyte entry (see 2.), or a
   termination label for the code sequence. If the latter, a
   hardware code for the register(s) now occupied is loaded
   into HL. This value is later used by CONTN and CONTD to
   generate the operator-instructions.

   The structure of the store sequences is similar, except
   that they start with only

      DB W

   instead of S, W, D.
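
The register codes in the S, W and D bytes lend themselves to a
bit-mask representation, sketched below. The bit assignment and the
helper names are assumptions made for illustration; the report does
not give the actual encoding. The two tuples correspond to the two
load examples given above.

```python
# Hypothetical bit assignment, one bit per register.
A, D, E, H, L = 0x01, 0x02, 0x04, 0x08, 0x10
HL = H | L
DE = D | E
ALLREG = HL | DE | A

# (S, W, D) for the two load examples.
LOAD_BASED_ADDRESS_INTO_DE = (0, HL | DE, DE)
LOAD_BYTE_AT_HL_INTO_A = (HL, A, A)


def destroys(sequence, occupied):
    """True if the sequence's W byte overlaps the occupied registers."""
    _source, way, _destination = sequence
    return bool(way & occupied)


def register_names(code):
    """List the registers contained in a register code."""
    table = (("A", A), ("D", D), ("E", E), ("H", H), ("L", L))
    return [name for name, bit in table if code & bit]
```

A routine such as SAVE could use a test like `destroys` to decide
whether a result currently held in a register has to be saved before
the sequence is output.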

2. The special codebyte entries are: 

LXIH, LHLDH, LOIR, LHASA and LEASE. 

They all have the structure: 

MVI A,INFO 
CALL CPRO 

<instructions> 

JMP NOVAL. 

They are accessed by 'JMP'-instructions in the code 
sequences in 1. 

Their function is to output the bytes in <instructions>. 

Afterwards, the processing continues with the bytes 
in the sequence from which the access came. 

The INFO is the INFO-byte in CURHL, see 'Variables used
in exp-processing'.

3. CPRO outputs the <instructions> in a special
codebyte entry.

It also calls on the HL-optimization system to see
if the output is necessary, or can be substituted by
a few 'INX H's or 'DCX H's.

6.5.2 ACCESS TABLES TO CODE BYTES 

There are several of these, one points to store sequences, 
the others point to load sequences. Index in each of them 
is an operand mode. Each entry consists of 1,2 or 3 addresses 
of code sequences. 

The tables are:

LDLEFT,
LDRIGHT    Used by GENOPT for operators of type SYMM,
           ASYM and XSYM. Load left and right operand
           respectively.

LDDLEFT,
LDDRIGHT   Used by GENOPT for operators of type DELE.

LDBYT,
LDADR      Used by GENLDR for the final load in BYTEXP
           and ADREXP respectively.

LDMIX      Used by GENLDR for the final load in EXP, by
           GENASS to load right operand, and by GENOPT for
           operators of type LOAD.

STLEFT     Used by GENASS to store into left operand.

6.5.3 The entries GENOPT, GENASS, GENLDR

GENOPT: Generates the code for an operation. 

        Input is operand modes in BC-registers and 
        operator in OPTOR. 

        First RELTYP is called to modify left operand 
        mode if necessary. 

        Then, depending on OPTYP, an access table to code 
        bytes is chosen. 

        For increment/decrement-operations we have: 

           Code to load one operand is generated by LD1OPN. 

           Code for the incr/decr is generated by GENINC. 

        For all other operators we have: 

           Code to load the two operands is generated 
           by LD2OPN. 

           Code for the operator is generated by CONTN 
           or CONTD. 

GENASS: Generates the code for an assignment. 

        Code to load the right operand into A, DE or 
        HL is generated by LD1OPN. 

        Code to store the result in the left side is 
        generated by ST1OPN. 

GENLDR: Uses LD1OPN to generate code to load right operand 
        into a register. If the register was HL, an 'XCHG' 
        is generated, so the result will be in A or DE. 
6.5.4 Utility routines for code generation 

GENINC: Generates the increment/decrement-instructions 
        for the ±1 operator, and the increment necessary 
        for the unary - operator. 

SAVE:   See 'Runtime use of registers'. 

        The routine operates as follows: 

procedure SAVE(X); 
comment X is registercode for registers that must 
        be saved. (See CODE BYTE TABLE); 
begin A = 'registercode for LASTR'; 
      if A and X = 0 
      then comment TOP-save; 
           begin if TOP =/= 0 
                 then begin call CLRBC; 
                            call SAVR(TOP); 
                      end; 
                 TOP = LASTR; 
           end 
      else comment real save; 
           begin call CLRBC; 
                 if TOP =/= 0 
                 then begin call PSHR(TOP); 
                            TOP = 0; 
                      end; 
                 call SAVR(LASTR); 
           end; 
end SAVE; 
CHANGE: Interchanges the two operands in cases where this 
        is required, or where it makes it possible to generate 
        better code. 

        The choice depends: 

        1. On the type of the operator (in OPTYP) 

        2. On the type-combination of the operands 
           (byte/addresses ...) 

        3. In case of a SYMM-operator with byte/byte-operands, 
           on a 'priority' of the byte-modes 
           defined in the table L3 in the routine. 

RELTYP: 
ARELTYP: Inspects the operand-modes in B (left) and 
         C (right). 

         See 'Run-time use of registers'. 

         If both are non-reg, nothing is done. 

         If one is non-reg, 

              LASTR = TOP 
              TOP = 0 

         is done; this is a preparation for a later SAVE-call. 

         If both are reg, we have 3 cases: 

         1. TOP =/= 0: 
            TOP = 0, left mode is not changed. 

         2. BCOCC =/= 0: 
            BCOCC = 0, left mode is changed to BINB or IINB 
            (see Operand modes). 

         3. Else left mode is changed to 
            BREL, IREL, BMRL or IMRL 
            (see Operand modes). 
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ADSAVE: Input to SAVE is a byte containing information 

about which registers the code sequence to be 
generated will destroy. For binary operators this 
sequence consists of 3 parts: loading of the 
2 operands and the operator code generated by 
CONTN or CONTD. For the 2 operands the information 
is taken from the W-bytes preceding the sequence 
in the CODE BYTE TABLE. The routine ADSAVE is used 
to add the information for the third part, the operator 
sequence, after the first two have been 'or'-ed 
together. 

LD1OPN: Accesses one of the load sequences in the CODE BYTE 
        TABLE, calls SAVE and then goes to TPRO to output 
        the sequence. 

ST1OPN: As LD1OPN, but for the store sequences in the 
        CODE BYTE TABLE. 


LD2OPN: Accesses via ACCESS TABLES TO CODE BYTES 2 load 
        sequences for left operand and 2 for right operand, 
        and has to make a 'choice' of a possible left-right 
        combination. 

        The choice for each combination is made by TRY. 

        If the two left-alternatives are called L1 and L2, 
        and the right R1, R2, they are tried 
        as follows: 

             1:  L1-R1 
             2:  L2-R1 
             3:  L1-R2 
             4:  L2-R2 

        If a combination is possible, TRY does not return, 
        so after the fourth TRY call an ABORT call is 
        included. Reaching the ABORT means that the compiler 
        has met an operand combination for which the ACCESS 
        TABLES TO CODE BYTES do not contain a possible 
        combination, and should hopefully never occur. 

        If the D-byte of R1 above contains the flag R2FIRST, 
        alternative 2 is skipped. This is done for 
        optimization reasons. 

TRY:    Input is pointers (DE,HL) to two load sequences 
        in the CODE BYTE TABLE, one for left (L), the other 
        for right (R) operand. TRY is to decide whether 
        these sequences will destroy each other's use of 
        registers. If we use the dot-notation for the 
        S, W, D-bytes of L and R respectively, the 
        algorithm is: 

        If 

             (L.W and R.S) or (L.D and R.W) = 0 

        it is possible to execute first L, then R. 

        If 

             (R.W and L.S) or (R.D and L.W) = 0 

        it is possible to execute first R, then L. 

        The B-reg is set to mark which of the two was chosen. 

        If none was possible, TRY returns to after the call. 

        When a possible combination is found, TPRO is called 
        twice (via O1) to output the chosen code sequences. 

        The stack-pointer is modified, so that the return will 
        be after the LD2OPN-call. 

        *** NB *** Thus, if any routines are inserted/removed 
        in the call-sequence LD2OPN-TRY-T1-O1-TPRO, this 
        stack-pointer modification must be updated. 
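
The selection logic described for the two-operand load and for TRY can be sketched in Python as follows. This is only an illustration under stated assumptions: the S, W and D bytes are treated as opaque register bit masks, the R2FIRST flag is passed in as a plain boolean, and the names Seq, can_run, try_pair and ld2opn are hypothetical. The real routines do not return a value; they leave the chosen order in the B-reg and return by modifying the stack pointer.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Seq(NamedTuple):
    """Register information for one load sequence (opaque bit masks)."""
    s: int  # S-byte
    w: int  # W-byte
    d: int  # D-byte


def can_run(first: Seq, second: Seq) -> bool:
    # (first.W and second.S) or (first.D and second.W) = 0
    return ((first.w & second.s) | (first.d & second.w)) == 0


def try_pair(left: Seq, right: Seq) -> Optional[str]:
    """Return 'LR' (L then R), 'RL' (R then L), or None if neither works."""
    if can_run(left, right):
        return "LR"
    if can_run(right, left):
        return "RL"
    return None


def ld2opn(lefts: Sequence[Seq], rights: Sequence[Seq],
           r2first: bool = False) -> Tuple[int, int, str]:
    """Try L1-R1, L2-R1, L1-R2, L2-R2 in that order (indices 0 and 1)."""
    attempts = [(0, 0), (1, 0), (0, 1), (1, 1)]
    for number, (li, ri) in enumerate(attempts, start=1):
        if number == 2 and r2first:
            continue  # alternative 2 is skipped when R2FIRST is set
        order = try_pair(lefts[li], rights[ri])
        if order is not None:
            return li, ri, order
    raise RuntimeError("ABORT: no possible operand combination")
```

For example, ld2opn([Seq(1, 1, 1), Seq(0, 0, 0)], [Seq(1, 1, 1), Seq(0, 0, 0)]) rejects the first combination in both orders and settles on the second.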
TPRO:   Processes one entry in the CODE BYTE TABLE until 
        a 'JMP' is met. It then returns via one of the 
        labels AVAL, BVAL ... in this table. At these 
        labels, HL is loaded with the hardware register 
        codes required by CONTN/CONTD. Also a result code 
        is loaded into A at some of the entries. 

OOPN:   Is called from TPRO when an OPERAND-byte is met. 
        It decides whether it represents an address or 
        byte operand (OPERAND is followed by a 0-byte 
        or not) and whether left or right operand is 
        indicated (B-reg set in TRY). It then outputs 
        the operand from OPNL or OPNR. 

MAHLM and 
MHLM:   Are 'short-cuts' for much-used code sequences. 

ADDM1/2/3: 
        Are 'access routines' to access tables for code 
        bytes, for tables with 1, 2 or 3 words in the 
        entries respectively. 

LOADH:  Is used by GENASS to ensure that an address constant 
        is loaded into HL instead of DE in certain cases, 
        to avoid unnecessary 'XCHG'es. 

CODPT:  Loads the XCGB-entry of an operator into IDFLAG, 
        OPTYP, OPTOR, RESULT, which are used by the code 
        generation routines. 

CLRBC:  Clears BC: 

           if BCOCC = true 
           then generate 'PUSH B'; 
           BCOCC = false; 
EXPSAVE: This is not a utility routine; it is called from 
         outside the exp-processing, from PRIMARY. It is 
         described here, since it uses utility routines. 

         It is: (A is input SAVE-byte). 

         call FIXRL; 
         if 'HL to be saved' then call CLRCURHL; 
         if A = ALLREG 
         then begin call CLRBC; 
                    call PSHR(TOP); 
                    call PSHR(LASTR); 
              end 
         else call SAVE(A); 

FIXRL: 

This routine generates the code necessary to transform 
one of the result codes RESZ - RESNC to a logical byte 
(0000B or 1111B) in the A-reg. The code generated is: 

Result-code   Code generated 

RESZ          CPI 1      ; only A = 0 sets carry 
              SBB A      ; carry = 1111, no carry = 0000 

RESNZ         CPI 1 
              CMC 
              SBB A 

RESC          SBB A 

RESNC         CMC 
              SBB A 

This routine is called from outside the code generation, 
from the ACTION routines. 
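
The effect of the four FIXRL sequences can be checked with a small Python model of the 8080 instructions involved. The sketch models only the carry flag and the A register; the function names are hypothetical. Note that SBB A leaves the full byte 00H or 0FFH in A, which is what the model returns.

```python
def cpi(a: int, imm: int) -> bool:
    """CPI imm: compare A with imm; carry is set when A < imm (borrow)."""
    return (a & 0xFF) < (imm & 0xFF)


def sbb_a(carry: bool) -> int:
    """SBB A: A - A - carry, i.e. 0FFH if carry was set, else 00H."""
    return (0 - int(carry)) & 0xFF


def fixrl(result_code: str, a: int = 0, carry: bool = False) -> int:
    """Return the logical byte left in A by the sequence for result_code."""
    if result_code == "RESZ":
        carry = cpi(a, 1)          # CPI 1: only A = 0 sets carry
    elif result_code == "RESNZ":
        carry = not cpi(a, 1)      # CPI 1 followed by CMC
    elif result_code == "RESC":
        pass                       # carry already holds the result
    elif result_code == "RESNC":
        carry = not carry          # CMC
    else:
        raise ValueError(result_code)
    return sbb_a(carry)            # SBB A
```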


SAVR and PSHR 

These routines generate code to save a register in BC or 
on the stack, depending on the result code in LASTR or TOP, 
and on 'reentrant' mode or not. 

The code generated is: 

Result-   by SAVR       by PSHR       by SAVR/PSHR 
code      non-reent.    non-reent.    reentrant 

IREG      PUSH D        PUSH D        PUSH D 

BREG      MOV B,A       PUSH PSW      PUSH PSW 

INHL      PUSH H        PUSH H        PUSH H 

BADR      MOV B,M       MOV B,M       PUSH H 
                        PUSH B 

IADR      MOV C,M       MOV C,M       PUSH H 
          INX H         INX H 
          MOV B,M       MOV B,M 
                        PUSH B 
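
A table-driven reading of the SAVR/PSHR code table might look like the following Python sketch. It assumes the column assignment read from the table above (SAVR non-reentrant, PSHR non-reentrant, either routine in reentrant mode); SAVE_CODE and save_code are hypothetical names, not routines of the compiler.

```python
# result code -> (SAVR non-reentrant, PSHR non-reentrant, reentrant)
SAVE_CODE = {
    "IREG": (["PUSH D"], ["PUSH D"], ["PUSH D"]),
    "BREG": (["MOV B,A"], ["PUSH PSW"], ["PUSH PSW"]),
    "INHL": (["PUSH H"], ["PUSH H"], ["PUSH H"]),
    "BADR": (["MOV B,M"], ["MOV B,M", "PUSH B"], ["PUSH H"]),
    "IADR": (["MOV C,M", "INX H", "MOV B,M"],
             ["MOV C,M", "INX H", "MOV B,M", "PUSH B"],
             ["PUSH H"]),
}


def save_code(result_code: str, routine: str, reentrant: bool = False):
    """Return the instructions generated by 'SAVR' or 'PSHR'."""
    savr, pshr, reent = SAVE_CODE[result_code]
    if routine not in ("SAVR", "PSHR"):
        raise ValueError(routine)
    if reentrant:
        return list(reent)      # same code for both routines
    return list(savr if routine == "SAVR" else pshr)
```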
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CONTN: 

On arrival here, BC and DE contain the hardware 
register codes for left and right operand, set by 
LD2OPN. OPTOR, low and high byte, contains the operator 
codes. With this information code is generated, 
generally following the pattern: 

        MOV A,LL   ; I1 
        OPL RL     ; I2 
        MOV E,A    ; I3 
        MOV A,LH   ; I4 
        OPH RH     ; I5 
        MOV D,A    ; I6 

OPL, OPH are operators for low and high byte 
LL = register where low byte of left opn. is placed 
LH = register where high byte of left opn. is placed 
RL = register where low byte of right opn. is placed 
RH = register where high byte of right opn. is placed 

Not all the above instructions are always generated. 

Main exceptions are: 

ex.a. If both operands are of type byte, only I1 and 
      I2 are generated; the result is therefore in 
      A-reg. Otherwise, as seen, the result is in DE. 

ex.b. If the operator is one of <, >, <=, >=, 
      I3 and I6 are not generated. 

ex.c. I1 is not generated if LL = A-reg. 

ex.d. The sequences for =, <>, AND, OR, XOR are 
      somewhat shortened; this is done by subroutines 
      STORE and LHZERO. 

To generate I1-I6, a set of routines is used. They are: 

MOVA:   generates I1 and I4 
OPRAT:  generates I2 and I5 
STORE:  generates I3 and I6 
INCH:   generates 'INX H' if one operand is address 
        of address. 
LHZERO: generates special code if LH = 0, see ex.d. 
        above. 

The instructions are 'calculated' rather than taken from 
tables. This is possible since the hardware is sufficiently 
systematic. The operator-codes OPL and OPH have the values 
(OP B) - 1. (E.g. (SUB B) - 1 for subtraction). 

The registercodes LH, ... have the values 1, 2, ... 7 for 
registers B, C, D, E, H, L, M respectively. This will 
give an operation instruction when operator code and 
register code are added. 

Calculation of 'MOV A, register' is done similarly. 

The 'register codes' for immediate operands (8), 'LDA'- 
instructions (16), and CMA (FFH, no operand) must be 
tested and treated separately. 
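
The instruction calculation can be illustrated with the standard 8080 opcodes. In the Python sketch below, the opcode values are those of the 8080 instruction set and the register codes 1 to 7 are as stated above; the function names are hypothetical, and the special register codes (8, 16 and the CMA case) are deliberately left out.

```python
# Opcodes of the register-B forms on the 8080.
OP_B = {"ADD": 0x80, "ADC": 0x88, "SUB": 0x90, "SBB": 0x98,
        "ANA": 0xA0, "XRA": 0xA8, "ORA": 0xB0, "CMP": 0xB8}
MOV_A_B = 0x78  # MOV A,B

# Register codes as given in the text: 1..7 for B, C, D, E, H, L, M.
REG_CODE = {"B": 1, "C": 2, "D": 3, "E": 4, "H": 5, "L": 6, "M": 7}


def operator_code(mnemonic: str) -> int:
    """Operator code as stored by the compiler: (OP B) - 1."""
    return OP_B[mnemonic] - 1


def operation(mnemonic: str, reg: str) -> int:
    """Opcode of 'mnemonic reg' = operator code + register code."""
    return operator_code(mnemonic) + REG_CODE[reg]


def mov_a(reg: str) -> int:
    """Opcode of 'MOV A,reg', calculated the same way."""
    return (MOV_A_B - 1) + REG_CODE[reg]
```

With these definitions, operation("SUB", "M") yields 96H (SUB M) and mov_a("E") yields 7BH (MOV A,E).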
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CONTD: 

On arrival here, registers BC and DE contain register 
codes, as for CONTN, the only possible are BC, DE and 
HL. Depending on the operator code in low-byte of OPTOR, 
code is generated for one of the operators: 

Double add, routine CDADBD 

Multiplication, routine CMULT 

Division, routine CDIV 

Index calculation, routine CIND 

The operator MOD is treated as division, only the setting 
of LASTR differs, so that the result of the division is in 
DE, the MOD result in HL. (See XCGB entries) 

Multiplication and division are performed by routines which 
are generated in-line the first time one of these operators 
occurs. This is done by the routines GENMUL and GENDIV. 

GENMUL/GENDIV 

The structure of these is similar, so only GENMUL is 
described: 

The value of ADMUL is tested. If it is =/= 0, the 
multiplication routine has already been output by an earlier 
call, and only the call to this address is generated (by GCALL). 

If ADMUL = 0 the routine must be output. It consists of the 
instructions from label J1 up to label L1. They cannot be 
directly output, since all addresses in local jump 
instructions must be changed. The table L1 contains a list of 
the lengths of all the intervals between the jump instructions. 

The processing uses the subroutines: 

GCALL: to generate the call on the multiplication routine. 





INIBD: to set BC to the difference between 

       1. the address where the routine will be output. 
       2. the address of the routine where it lies in 
          the compiler. 

       This value in BC is used to update the jump 
       instructions. 

TABP:  to process the table L1 until a 0-byte, by calls 
       on GBYT and GPADR. 

GBYT:  Outputs the bytes between jump instructions 
       directly. 

GPADR: Outputs the address of a jump instruction, 
       after it has been modified by the value of BC 
       (see INIBD). 
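
The relocation scheme can be sketched in Python as below. The exact layout of the interval table is not given here, so the sketch makes assumptions: each non-zero table entry n means "copy n bytes unchanged, then relocate one 16-bit little-endian address", a 0 entry ends the table, and any bytes after the last address are copied unchanged. The function name and parameters are hypothetical.

```python
from typing import Sequence


def relocate(routine: bytes, intervals: Sequence[int],
             compiled_at: int, output_at: int) -> bytes:
    """Copy a routine, adjusting the addresses of its local jumps."""
    # Corresponds to the difference kept in BC (see INIBD).
    delta = (output_at - compiled_at) & 0xFFFF
    out = bytearray()
    pos = 0
    for length in intervals:
        if length == 0:          # 0-byte terminates the table
            break
        # Bytes between jump addresses are output directly (GBYT).
        out += routine[pos:pos + length]
        pos += length
        # The jump address is modified by the difference (GPADR).
        addr = routine[pos] | (routine[pos + 1] << 8)
        addr = (addr + delta) & 0xFFFF
        out += bytes([addr & 0xFF, addr >> 8])
        pos += 2
    out += routine[pos:]         # assumed: trailing bytes copied as-is
    return bytes(out)
```

For instance, a routine "MVI A,0 / JMP 1005H / RET" held at 1000H in the compiler and output at 2000H comes out with the jump retargeted to 2005H.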

6.5.5 The HL-optimization system 

The system consists of the variable CURHL (5 bytes) and 
the routines TSTCURHL, SETCURHL, INCCURHL, CLRCURHL and 
PUTIH. 


The purpose is to avoid the generation of consecutive 
HL-loads with the same address, or to substitute such 
HL-loads with a few 'INX H's or 'DCX H's when possible. 

All loading of HL is done via the special codebyte entries 
in the CODE BYTE TABLE. These are all processed by CPRO, 
which calls TSTCURHL and SETCURHL. 
CURHL: 

The 5 bytes have the following contents: 

1.   INFO-byte 
2.   LOC of last operand loaded into HL 
3-4. Value of last operand loaded into HL 
5.   Number of 'INX H's generated. 

LOC and value: see Compile time stack. 
INFO has the form 2xN + I 

   N is the number of bytes in the code sequence necessary 
   to load the operand. 

   I = 0 for non-based, = 1 for based. 

Ex.: The sequence 'LHLD operand' 
     has INFO = 2x3+1 = 7 
     (consists of 3 bytes, loads address of 
     based variable). 


SETCURHL: 

Is called from CPRO each time code has been 
generated to load HL with an operand. 

   CURHL     = INFO from the special codebyte entry used 
   CURHL+1   = operand.LOC (see CTS) 
   CURHL+2,3 = operand.value (see CTS) 
   CURHL+4   = 0 ('INX H'-count) 


TSTCURHL: 

Is called from CPRO each time code shall be 
generated to load HL with an operand. 

Input is INFO from the special codebyte entry 
where CPRO is called. This INFO, and the LOC/value 
of the operand (OPN) to be output, are compared 
with INFO/LOC/value in CURHL. 

'Normal output' means that the full load sequence 
                has to be output. 

'Spec. output'  means that output is not necessary, 
                or can be reduced to 'INX H's or 
                'DCX H's. 

The testing then goes as follows: 

1. If INFO =/= CURHL.INFO, normal output. 

2. If OPN.LOC =/= CURHL.LOC, normal output. 

3. DIFF = CURHL.VALUE - OPN.VALUE 

4. If INFO is based & DIFF =/= 0, normal output. 

5. If DIFF + (CURHL+4) > INFO.bytecount 
   (addresses too far apart), normal output. 

6. Special output: 

   Generate DIFF + (CURHL+4) 
   'DCX H's or 'INX H's, depending on sign of 
   DIFF + (CURHL+4). 

   Normal output is dropped. 
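
The six test steps can be sketched in Python as follows. The sketch is an illustration only: CurHL and tstcurhl are hypothetical names, the comparison in step 5 is taken on the magnitude of DIFF + (CURHL+4), which is an assumption, and any bookkeeping of CURHL after a special output is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CurHL:
    info: int = 0       # INFO-byte, 2xN + I
    loc: int = 0        # LOC of last operand loaded into HL
    value: int = 0      # value of last operand loaded into HL
    inx_count: int = 0  # number of 'INX H's generated since


def tstcurhl(cur: CurHL, info: int, loc: int,
             value: int) -> Optional[List[str]]:
    """Return None for normal output, else the special output."""
    if info != cur.info:                 # step 1
        return None
    if loc != cur.loc:                   # step 2
        return None
    diff = cur.value - value             # step 3
    if (info & 1) and diff != 0:         # step 4: based operand
        return None
    steps = diff + cur.inx_count
    if abs(steps) > info // 2:           # step 5 (magnitude assumed)
        return None
    if steps > 0:                        # step 6
        return ["DCX H"] * steps
    return ["INX H"] * (-steps)
```

An empty list means that HL already holds the required address and nothing needs to be output.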
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INCCURHL: 

Increments CURHL + 4. 

Is called whenever an 'INX H' is generated. 


CLRCURHL: 

Sets CURHL to zero. 

Is called when code is generated to change HL, 
apart from the special codebyte entries that 
load HL, and 'INX H's. 


PUTIH: 

Outputs a byte with PUTI and calls CLRCURHL. 
Used for single-byte outputs such as 'XCHG', 
'POP H' etc., that will destroy HL. 





7. PL/M statement routines 

As explained in section 3, there exists in principle one 
semantic routine for each non-terminal symbol in the PL/M 
grammar. Each semantic routine is responsible for scanning 
the source program, for generating the necessary object code, 
and for recovering from syntactical errors. Certain routines 
place semantic information into the symbol stack 
(DECLARESTATEMENT, PROCEDUREDEFINITION, LABELDEFINITION), 
whilst others extract semantic information from the symbol 
stack (e.g. EXPRESSION, ASSIGNMENT, PROCEDURECALL). 


7.1 STATEMENT 

A PL/M program is compiled by repeatedly calling the subroutine 
STATEMENT; 


<program>::= [<statement>;]... EOF 


In general, STATEMENT can determine which statement type is to 
follow (i.e. which of the alternative right-hand sides of the 
grammatical rule for <statement> is applicable) on the basis of 
the first basic symbol of the statement: 


<statement>::= DECLARE...    (DECLARESTATEMENT) 
             | DO...         (GROUPSTATEMENT) 
             | IF...         (IFSTATEMENT) 
             | GOTO...       (GOTOSTATEMENT) 
             | CALL...       (CALLSTATEMENT) 
             | RETURN...     (RETURNSTATEMENT) 
             | HALT...       (HALTSTATEMENT) 
             | ENABLE...     (ENABLESTATEMENT) 
             | DISABLE...    (DISABLESTATEMENT) 


In these straightforward cases the corresponding semantic 
routine (the name in parentheses above) is called to scan the 
statement. 
A difficulty arises where a statement begins with an identifier 
instead of a reserved word: 

<statement>::= <variable>[,<variable>]... = <expression>   (ASSIGNMENT) 
             | identifier:                                 (LABELDEFINITION) 
             | identifier: PROCEDURE...                    (PROCEDUREDEFINITION) 

A one-symbol lookahead (to locate ':') is sufficient to distinguish 
<assignment> from the other two alternatives, which can only be 
resolved by a further symbol lookahead (to locate 'PROCEDURE'). 
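
The choice of semantic routine described in this section can be sketched in Python. The sketch assumes that the statement is already available as a list of basic symbols and that ':' is the symbol located by the first lookahead; select_routine is a hypothetical name, and the real STATEMENT routine calls the chosen routine rather than returning its name.

```python
RESERVED = {
    "DECLARE": "DECLARESTATEMENT",
    "DO": "GROUPSTATEMENT",
    "IF": "IFSTATEMENT",
    "GOTO": "GOTOSTATEMENT",
    "CALL": "CALLSTATEMENT",
    "RETURN": "RETURNSTATEMENT",
    "HALT": "HALTSTATEMENT",
    "ENABLE": "ENABLESTATEMENT",
    "DISABLE": "DISABLESTATEMENT",
}


def select_routine(symbols):
    """Return the name of the semantic routine for one statement."""
    first = symbols[0]
    if first in RESERVED:
        return RESERVED[first]       # decided by the first symbol
    # The statement starts with an identifier: look one symbol ahead.
    if len(symbols) > 1 and symbols[1] == ":":
        # A further symbol is needed to separate the two ':' forms.
        if len(symbols) > 2 and symbols[2] == "PROCEDURE":
            return "PROCEDUREDEFINITION"
        return "LABELDEFINITION"
    return "ASSIGNMENT"
```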
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7.2 DECLARESTATEMENT 

<declarestatement>::= DECLARE <decl element>[,<decl element>]... 

<decl element>::= <type declaration> 
                | <identifier> LITERALLY <string> 
                | <identifier> DATA (<constant list>) 
                | <structure declaration> 

<type declaration>::= <identifier list>[(<number>)] 
                      <type>[INITIAL(<constant list>)] 

<identifier list>::= <identifier> 
                   | (<identifier>[,<identifier>]...) 

<type>::= BYTE 
        | ADDRESS 
        | LABEL 

<structure declaration>::= <identifier specification> 
                           <dimension> <structure type> 

<structure type>::= BY <identifier> 
                  | STRUCTURE (<member element> 
                    [,<member element>]...) 

<member element>::= <variable name> 
                    [<dimension>] <member type> 

<member type>::= BYTE 
               | ADDRESS 
               | <structure type> 

<dimension>::= (<number>) 

The primary function of DECLARESTATEMENT is to insert names and 
associated semantic information (type, address) into the symbol 
stack (cf. section 4). 






No space is allocated in the object program for explicit label 
declarations (DECLARE... LABEL) or macro strings (DECLARE... 
LITERALLY...). For explicit label declarations, an undefined 
label address is placed into the symbol stack entry. Macro 
strings are placed into a separate string pool. 

For all other non-based declarations, space is allocated 
in the object program, either in the program code area 
for DATA declarations, or in the variable area for 
non-label type declarations. 

The constant values in a DATA declaration are placed into the 
program code area preceded by a jump instruction to skip over 
the non-executable values. For type declarations with the 
INITIAL attribute, the initial values are placed consecutively 
into the space allocated for the associated <identifier list>. 
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7.3 LABELDEFINITION 

<labeldefinition>::= <identifier>: 

As described in Section 4, the scanner automatically enters 
all previously unknown names into the symbol stack. So 
after having scanned ':' (which distinguishes <labeldefinition> 
from <assignment>), the label name must necessarily exist 
in the symbol stack. If the label was previously defined 
at this block level, then the label definition is an 
erroneous redeclaration. Otherwise, the current program 
address is inserted into the symbol stack entry, and 
previous references (forward jumps), if there were any, are 
resolved (for further details see section 7.8). 
7.4 PROCEDUREDEFINITION 

<procedure definition>::= 
        <procedure start><procedure type><attributes and body> 
        END [<identifier>] 

<procedure start>::= 
        <identifier>: PROCEDURE 

<procedure type>::= 
        [<parameter list>][<type>] | INTERRUPT <number> 

<attributes and body>::= 
        [PUBLIC]; <statementlist> | EXTERNAL [<address>];[<specifications>] 

<type>::= BYTE|ADDRESS 

<statementlist>::= <statement>; | <statement>;<statementlist> 

<address>::= <absolute address constant> 

<number>::= 1|2|3|4|5|7 

<specifications>::= <specification>; | 
                    <specification>;<specifications> 

<specification>::= <parameter declarestatement> 

<parameter list>::= (<parameter tail> 

<parameter tail>::= <identifier>) | 
                    <identifier>, <parameter tail> 


This routine processes both "normal" procedures and interrupt 
procedures, both "internal" and external procedures and public 
procedures. 

Normal procedures may have parameters, but not interrupt 
procedures; interrupt procedures must have an interrupt number. 
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The code for the procedure, if any, is placed in the program 
segment, and a jump around it is generated. The internal entry 
to a procedure is the address of the first statement code, and 
all parameter checking has been done at compile time in routine 
PCALL. The external entry, i.e. from outside the program, is 
the address of a piece of code following the actual procedure. 
This code checks the number of parameters, pops the parameter 
values from the stack and stores these, and finally transfers 
to the internal entry. 

The routine enters a description of the procedure into the 
symbol table. Inside a procedure the formal parameters also 
have a complete description in the symbol table. However, 
after END the parameter descriptions are compressed: Their 
names vanish, i.e. they get names of length zero, and only 
the semantics remain (altogether 7 bytes). This enables the 
compiler to convert the actual parameter expressions to the 
formal type, and to store them into the memory cells of the 
formal parameters. Note, no diagnostics are given if the actual 
and the formal type do not agree. Parameter compression is done 
in the routine COMPPAR, and the generation of the external entry 
by routine GENXE. The processing of a parameter list is performed 
in routine PARLIST. The identifiers of the parameter list are 
entered into the symbol table with type tparam, i.e. as 
unspecified parameters. DECLARESTATEMENT, when processing a 
specification of such a parameter, will then change the symbol 
table entry to whatever is specified. 

All internal procedures must at least have one statement 
(may be dummy). If there is none, an error message is given, 
but the compiler will nonetheless accept the END. 

All parameters must be specified, and they must be specified 
before use. The routine will give error messages for any 
unspecified parameters at END. This is also the case for 
external procedures. 
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Type procedures must contain at least one RETURN statement. 

The procedure definition routine will give an error message 
at END, if it is not the case. 

If a procedure has the PUBLIC attribute, it may be used 
throughout the remainder of the program, even if declared 
at an inner level. A procedure with the same name and the 
EXTERNAL attribute given will be matched with the associated 
PUBLIC procedure (if present), and thus defines a local usage 
of a public procedure defined elsewhere. This may be used 
to save symbol table requirements, and also for syntax-checking 
of parts of a program. 

External procedures may have an external entry address 
specified, in which case the compiler will not try to match 
any public procedures; the match is supposed to be found in a 
separately compiled program. 

There are five flags for the procedure in the symbol table 
entry: 

metreturn:      1 = a return statement has been encountered 
                in the procedure, 0 = otherwise. 

public:         1 = attribute PUBLIC was recognized for 
                the procedure, 0 = otherwise. 

external:       1 = external procedure, 0 = internal. 

address given:  1 = there is a (program) address 
                for the procedure in the symbol table entry, 0 = 
                external to be matched if external bit above is set. 

finished:       1 = the procedure definition is complete. 
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The flags 'address given' and 'finished' are used by the 
routine PCALL to push the parameter values instead of 
storing them, and to detect illegal recursive calls, 
respectively. 

When the END of a typeless procedure is met, an implicit 
RETURN is generated; for type procedures, however, no value 
is returned here. If the last statement before END was a 
return statement, then the implicit RETURN is not generated. 

Every procedure is a block. Thus the routine enters a block 
marker with type 'tblproc' at the beginning and removes it at END. 

All procedures have one push marked in the <various> field of 
the symbol table entry (cf. section 4.4), representing the return 
address on the stack, so that any GOTO <outermost block label> 
may generate the necessary POP instructions. 

Code generated for procedure definition: 

a) External procedure with address given or external with 
match to public: 

no code produced 

b) External procedure without address given and without 
   matching public: 

              JMP  - 
   <entry>    JMP  <address to matching public> 

   The last JMP is fixed up when matching public is found. 
   Parameters are allocated space. 

c) Procedure definition, not interrupt and not public: 

              JMP  - 
   <entry>    statements 
              [RET] 

d) Public procedure definition: 

                      JMP  - 
   <internal entry>   statements 
                      [RET] 

   If UTILITY-option is in effect: 

   <external entry>   CPI  <no. of formal parameters> 
                      MVI  A,1                     ; ERROR 01 
                      CNZ  MONERR 

                      byte parameter               address parameter 
                      POP  H                       POP  H 
                      MOV  A,L                     SHLD <parameter address> 
                      STA  <parameter address> 

                      JMP  <internal entry> 

e) Interrupt procedure definition: 

              JMP  <entry>     (at location 0BE8 + 3 x interrupt number) 

              LXI  H,<entry> 
              SHLD 0BE9 + 3 x interrupt number 
              JMP  - 
   <entry>    statements 
              [RET] 

   ('JMP -' denotes the jump around the procedure code.) 
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7.5 EXTERNAL definition 

<external definition>::= <external specification>[<input list>] 
                         [<return specification>]; 

<external specification>::= <label definition> EXTERNAL <number> 

<return specification>::= <type><register name> 

<input list>::= <input head><register name>) 

<input head>::= ( 
              | <input head><register name>, 

<register name>::= A | B | C | D | E | H | L | BC | DE | HL 


An external definition makes it possible to call assembly 
procedures residing at absolute addresses from a PL/MYCRO 
program. A call on such a procedure is identical to a call 
on a standard PL/M procedure. After the call all condition 
flags are as set by the called assembly procedure. 

The external procedure is given an internal PL/M-name just 
like ordinary PL/M procedures. The absolute location of the 
assembly procedure is given as a number following EXTERNAL. 

The assembly routine must have all its input parameters 
transferred in the registers. Furthermore, only one value 
may be returned from the assembly procedure (this is a 
PL/MYCRO restriction). 

Input parameters are specified by giving the list of registers 
which are to accept the actual parameters enclosed in 
parentheses, while a function is specified by giving the type 
(BYTE or ADDRESS) and the name of the register containing 
the return value. 
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The format is illustrated by the external specification of 
the MYCRO monitor function as given below. 


/* INTERFACE TO MONITOR FUNCTIONS */ 

MON:    EXTERNAL 40H; 
MONERR: EXTERNAL (A); 
TTSTAT: EXTERNAL 46H BYTE A; 
CRLF:   EXTERNAL 49H; 
TTCON:  EXTERNAL 4CH (BC); 
TTCLF:  EXTERNAL 4FH (BC); 
TTI:    EXTERNAL 52H BYTE A; 
TTIO:   EXTERNAL 55H BYTE A; 
TTO:    EXTERNAL 58H (A); 
DREG:   EXTERNAL 5BH; 
DMEM:   EXTERNAL 5EH (BC,HL); 
OUTHX:  EXTERNAL 61H (A); 
OUTH2:  EXTERNAL 64H (BC); 
INHEX:  EXTERNAL 67H BYTE A; 
INHX2:  EXTERNAL 6AH ADDRESS BC; 
LOAD:   EXTERNAL 6DH (A,BC) ADDRESS HL; 
LINK:   EXTERNAL 88H (A,BC) BYTE A; 
NIBBLE: EXTERNAL 8BH (A) BYTE A; 
MOVE:   EXTERNAL 8EH (BC,HL,DE); 
COMP:   EXTERNAL 91H (BC,HL,DE); 
FILL:   EXTERNAL 94H (BC,HL,A); 
LPO:    EXTERNAL 97H (A); 

/* INTERFACE TO BASIC DISKETTE SERVICE ROUTINES */ 

BDO:    EXTERNAL 70H (A) BYTE A; 
BRDR:   EXTERNAL 73H (A,B,C,HL) BYTE A; 
BWDR:   EXTERNAL 76H (A,B,C,HL) BYTE A; 
BDDR:   EXTERNAL 79H (A,B,C,HL) BYTE A; 
BDC:    EXTERNAL 7CH (A) BYTE A; 
BDD:    EXTERNAL 7FH (A) BYTE A; 
BLDR:   EXTERNAL 82H (A,B,C,HL) BYTE A; 
BUDR:   EXTERNAL 85H (A,B,C,HL) BYTE A; 
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The following code is generated for an EXTERNAL definition: 


        JMP  L1 
        <load actual parameters to specified registers> 
        CALL XX                          |  JMP XX 
        <place return value in A/DE>     | 
        RET                              | 
L1: 


For a typeless EXTERNAL, a simple jump to the procedure code 
is generated, otherwise the code is called and the result value 
loaded into A/DE on return. 

Example : 


TTCLF: EXTERNAL 4FH (BC); 
TTCON: EXTERNAL 4CH (BC); 
CRLF:  EXTERNAL 49H; 
OUTHX: EXTERNAL 61H (A); 

DECLARE A BYTE INITIAL (1); 

CALL TTCLF (.('INTERNAL CHARACTER VALUES',0)); 
DO WHILE A <> 0; 
   A = INCHAR; 
   CALL TTCON (.(' = ',0)); 
   CALL OUTHX (A); 
   CALL CRLF; 
END; 

EOF 
7.6 GROUPSTATEMENT 

<groupstatement>::= <group head>;[<statement>;]... END [<identifier>] 

<group head>::= DO 
              | DO WHILE <expression> 
              | DO CASE <expression> 
              | DO <identifier> = <expression> TO <expression> 
                [BY <expression>] 

A group statement is considered as a block in a PL/M program, 
within which identifiers can be redeclared. On entering the 
routine GROUPSTATEMENT, therefore, a block marker is placed 
onto the symbol stack to designate a new block level. The marker 
is removed from the symbol stack on exit from GROUPSTATEMENT, 
together with any names that have been introduced within the 
block (procedures NEWBLOCK and ENDBLOCK respectively). 

For a simple group statement (DO;...END), the only other actions 
are repeated calls on STATEMENT, to scan the body of the group 
and generate the corresponding code, until END is detected. 

The three other forms of group statement generate the following 
code, either to send control to a particular statement of the 
group body (DO CASE...) or to evaluate and test a looping 
condition, and to re-execute the loop if necessary: 

DO WHILE... 

LOOP:   <LOGEXP>               (generate code to 
                                evaluate and test <expression>) 
        jump-on-false EXIT 
        S0 
        ... 
        Sn 
        jump LOOP 
EXIT: 
DO CASE... 

        push EXIT 
        <ADREXP>               (generate code to 
                                evaluate <expression> 
                                to DE) 
        if 0 <= DE <= n then   (extract statement 
           jump (TABLE(DE))     address from table and 
        ret                     jump to it) 
L0:     S0 
        ret 
L1:     S1                     (S0-Sn comprise body 
        ret                     of do-group) 
        ... 
Ln:     Sn 
        ret 
TABLE:  L0                     (statement address table) 
        ... 
        Ln 
EXIT: 

During compilation of a DO-CASE statement, the addresses L0...Ln 
are maintained in the forward reference pool, in the same manner 
as forward GOTO references (see section 7.8). 
DO <identifier> = <expression>1 TO <expression>2 [BY <expression>3] 

Code to perform the following operations is generated: 

        <identifier> <- <expression>1     (generate code to 
                                           evaluate <expression>1 
                                           and assign result to 
                                           <identifier>) 

LOOP:   <identifier> <= <expression>2     (generate code to 
                                           evaluate <expression>2 
                                           and compare with 
                                           <identifier>) 

        jump-on-false EXIT 

        body of do-group 

        <identifier> <- <identifier> +    (generate code for 
            {1 | <expression>3}            <expression>3 if 
                                           present, and code 
                                           to perform addition 
                                           or increment) 

        jump LOOP 

EXIT: 

The actual code generated depends upon the type of the left-hand 
side of the assignment (the controlled variable), and whether an 
increment is specified (BY option) or implied, as follows: 
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DO. . 


LI: 


L2: 


DO <byte> = . 
TO<expr>: DO.. 

<bytexp> 

MOV M,A 


<byteexp> 
POP H 
SUB M 


JMP L3 
<bytexp > 


. . DO <address> = ... 

. BY<expr> DO... TO<expr> DO... BY<expr> 

<adrexp> 

( LXI H,- 
I LHLD- j 


MOV M,E 
INX H 
MOV M,D 
DCX H 


PUSH H 

<adrexp> 

POP H 

MOV A,E 
SUB M 
INX H 
MOV A,D 
SBB M 

JC exit 
PUSH H 

JMP L3 
<adrexp> 



RET 


RET 
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L3: <body of DO loop> 


POP H 



INK M 

CALL L2 


JNZ LI 

POP H 

DCX H 


ADD M 

INK M 


MOV M,A 

JNZ LI 


JNC LI 




INX H 



INR M 

exit: 

- 

JNZ Ll-1 


CALL L2 
POP H 
DCX H 
MOV A,M 
ADD E 
MOV M,A 
INX H 
MOV A,M 
ADC D 
MOV M,A 
JNC Ll-1 
7.7 IFSTATEMENT 

<ifstatement>::= IF <expression> THEN <statement> 
                 [; ELSE <statement>] 

The following code is generated for an if-statement: 

        <LOGEXP>               (generate code to evaluate 
                                and test <expression>) 
        jump-on-false L1 
        <statement>            (code for the true part) 

   no ELSE specified   |   ELSE specified 
                       | 
                       |        jump L2 
                       |   L1:  <statement>   (code for false part if any) 
   L1:                 |   L2: 
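
The two layouts can be produced by a small Python sketch that emits a symbolic listing. The name gen_if, the label naming scheme and the use of plain strings for code are assumptions made for the illustration; they are not taken from the compiler.

```python
import itertools
from typing import Iterator, List, Optional


def gen_if(cond: str, then_part: List[str],
           else_part: Optional[List[str]] = None,
           labels: Optional[Iterator[str]] = None) -> List[str]:
    """Return the symbolic code for IF cond THEN ... [ELSE ...]."""
    if labels is None:
        labels = (f"L{n}" for n in itertools.count(1))
    l1 = next(labels)
    code = [f"<LOGEXP {cond}>", f"jump-on-false {l1}"]
    code += then_part                 # code for the true part
    if else_part is None:
        code.append(f"{l1}:")         # no ELSE specified
    else:
        l2 = next(labels)
        code.append(f"jump {l2}")     # skip the false part
        code.append(f"{l1}:")
        code += else_part             # code for the false part
        code.append(f"{l2}:")
    return code
```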
7.8 GOTOSTATEMENT 

<gotostatement>::= GOTO <identifier> 
                 | GOTO <number> 

If the destination address of the GOTO is a fixed numeric address, 
then a simple jump instruction is issued: 

JMP <number> 

If the destination address is specified as a non-label variable 
name (which must be non-based and of address type), then the 
address is extracted from the variable and a jump issued as follows: 

LHLD <identifier> 

PCHL 

If the destination address is specified as a defined label 
identifier, then the address is extracted from the symbol stack 
entry and a direct jump issued: 

JMP <label address> 

If the label is undefined, then it must have been introduced 
into the symbol stack either explicitly by a DECLARE statement, 
or implicitly by scanning either this occurrence of the 
identifier, or a similar occurrence in a previous GOTO. A dummy 
jump is issued, and the current program address is appended to a 
chain of fixup addresses associated with this identifier's 
symbol stack entry: 

L:   jump 0       (and current program address L is 
                   appended to fixup chain for this 
                   <identifier>) 
Fixup chains are maintained in a separate forward reference
pool. This pool is initially structured by the procedure
DRQNPOOL into a single chain of two-address links:

[Figure: POOLTOP and the chain of free links]

POOLTOP points to the start of the free link chain, which is
terminated by a zero link.

A GOTO statement to an undefined label destination will cause
the procedure GETLINK to remove a link from the free link chain,
and append this to the label's forward reference chain. Consider,
for example, the program segment:

    S1.  GOTO L1
    S2.  GOTO L2
    S3.  GOTO L2
    S4.  GOTO L1
    S5.  L2:
The last GOTO in this program segment (S4) will cause the
following changes to the forward reference pool.

[Figure: forward reference pool before S4 and after S4, showing
POOLTOP, the chain for L1 (fixup for S1) and the chain holding the
fixups for S2 and S3]


The label definition S5 will cause previous references to L2 to
be fixed up, and the forward reference chain for L2 will be
returned to the forward reference pool (by procedure RELLINK).
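
The handling of the forward reference pool described in this section can be sketched as follows. The sketch is in Python for illustration only and is not the compiler's own code: the class name FixupPool, the pool size, the program addresses and the choice of adding new links at the head of a label's chain are all assumptions. Only the names POOLTOP, GETLINK and RELLINK, and the zero link used as chain terminator, come from the text.

```python
class FixupPool:
    """Pool of two-address links: (fixup address, next link). Link 0 is the zero link."""

    def __init__(self, size):
        self.addr = [0] * (size + 1)      # fixup address held by each link
        self.next = [0] * (size + 1)      # index of the following link, 0 = end of chain
        for i in range(1, size):          # thread all links into one free chain
            self.next[i] = i + 1
        self.pooltop = 1 if size > 0 else 0

    def getlink(self, chain_head, fixup_address):
        """Take a link from the free chain and add it to a label's chain.

        Returns the new head of the label's chain (assumption: newest link first).
        """
        if self.pooltop == 0:
            raise MemoryError("forward reference pool exhausted")
        link = self.pooltop
        self.pooltop = self.next[link]
        self.addr[link] = fixup_address
        self.next[link] = chain_head
        return link

    def rellink(self, chain_head):
        """Return a label's chain to the free chain; yields the addresses to fix up."""
        fixups = []
        link = chain_head
        while link != 0:
            fixups.append(self.addr[link])
            following = self.next[link]
            self.next[link] = self.pooltop
            self.pooltop = link
            link = following
        return fixups


if __name__ == "__main__":
    pool = FixupPool(4)
    l1 = l2 = 0                            # both labels start with empty chains
    l1 = pool.getlink(l1, 0x0100)          # S1. GOTO L1 (addresses are made up)
    l2 = pool.getlink(l2, 0x0103)          # S2. GOTO L2
    l2 = pool.getlink(l2, 0x0106)          # S3. GOTO L2
    l1 = pool.getlink(l1, 0x0109)          # S4. GOTO L1
    print([hex(a) for a in pool.rellink(l2)])   # S5. L2: fix up and release
```

After S5 the two links used for L2 are back on the free chain, while the two links for L1 remain allocated until L1 is defined.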
7.9 CALLSTATEMENT

<callstatement>::= CALL <identifier> [(<expr>[,<expr>]...)] 

A call statement is the only way of invoking a typeless 
procedure (a procedure that does not return a value). 

The <identifier> must be a procedure name, and the number 
of <expr> must be equal to the number of formal parameters 
to that procedure. 

The procedure name may be user-defined or predeclared 
(e.g. TIME). 

If the procedure is not typeless, the routine will give a 
warning message, but will accept it. 

For typeless declared procedures the routine PCALL is called
to process the call and generate the following code:

<evaluate parameter expression> 

<store value> see below 

CALL <procedure entry> 

<procedure-entry> is the <entry> for an internal procedure 
(see section 7.4), or the <address> specified for an external 
procedure (section 7.5). 

Parameters are passed as follows: 

a) formal byte parameter to internal procedure: 

<BYTEXP> (evaluate expr. to A) 

STA address of formal parameter 
b) formal address parameter to internal procedure:

<ADREXP> (evaluate expr. to DE) 

XCHG 

SHLD <address of formal parameter> 

c) parameter to external PL/MYCRO procedure: 

<ADREXP> (evaluate expr. to DE) 

PUSH D (NB! the last parameter is not pushed)

d) parameter to external assembly routine:

single register: double register: 

<BYTEXP> <ADREXP> 

PUSH PSW PUSH D 

Note that the last parameter is not pushed. 


For type-procedures (not standard form of PL/M CALL statement) 
the routine IDPRIM is called (section 5.2). 

For TIME, see section 5.5. 

For OUTPUT (not standard form of PL/M call statement) the code 
produced is: 

OUT <byte constant> 


7.10 RETURNSTATEMENT

<returnstatement>::= RETURN [<expr>]

This routine searches the block markers in the symbol 
table in decreasing order of block level, starting with 
the current block level. 

For every DO-block found during the search the necessary 
POP-instructions are generated. 

When a procedure (unfinished) is found, then for type- 
procedures the type-corresponding expression routine is 
called, which will generate the proper loading of the 
return register (A for byte, DE for address), and finally 
"RET" is generated. The metreturn-flag of the procedure 
is set. 

If no procedure is found, i.e. the search reaches block- 
level zero, then an error message is given and a recovery 
scan of input is made to the next semicolon. 

Generated code:

a) RETURN

        POP H     )  only if RETURN is
        ...       )  enclosed by DO...END's
        POP H     )  within procedure body
        RET

b) RETURN <expr>

        POP H     )
        ...       )  as above
        POP H     )
        <evaluate expr to A/DE for byte/address values>
        RET

7.11 HALTSTATEMENT

<haltstatement>::= HALT 

The "HLT" instruction is output. 

7.12 ENABLESTATEMENT

<enablestatement>::= ENABLE

The "EI" instruction is output.

7.13 DISABLESTATEMENT

<disablestatement>::= DISABLE

The "DI" instruction is output.

8. Input/Output

The functions of the input/output routines in the PL/MYCRO 
compiler are mainly: 

to deliver characters one-by-one from the source text 

to list the source text on a specified device, 
together with error messages, etc. 

to write out the generated object code. 

The input/output routines may be conveniently classified as 
general i/o routines and special PL/MYCRO i/o routines. The 
first are, with very minor modifications, identical to those 
used in the MYCRA assembler, and will therefore not be further 
described here. 

The special PL/MYCRO routines handle:

source program input 
listing output 
object code output 


8.1 Source program input 

The scanner obtains source input characters one-by-one by 
calling the routine INCHAR. Whenever the symbol CR/LF is 
encountered, this is replaced by the space character, and 
a new source record is read by a call on the procedure RDLINE. 

RDLINE has two functions, namely to read the next source record 
(through the system routine INPUT) and to trigger a printout 
of the source record by calling PRLINE - if a source listing 
is required. 
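
The interplay of INCHAR, RDLINE and PRLINE may be sketched as below. This is an illustration in Python only, under the assumption that source records are available as a list of strings without their CR/LF terminators; the function signature and the optional prline callback are hypothetical and do not reflect the actual routine interfaces.

```python
def inchar(records, prline=None):
    """Deliver source characters one by one.

    records: iterable of source records (lines without CR/LF).
    prline:  optional callback(line_number, record) used when a listing is required.
    """
    for number, record in enumerate(records, start=1):
        if prline is not None:
            prline(number, record)     # RDLINE triggers the listing of the record
        for ch in record:
            yield ch
        yield " "                      # the CR/LF symbol is replaced by a space


if __name__ == "__main__":
    listing = []
    source = ["A = 1;", "B = 2;"]
    text = "".join(inchar(source, lambda n, r: listing.append((n, r))))
    print(repr(text))
    print(listing)
```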
8.2 Listing output

Lines of listing output, containing source text, assembler 
listing, error/warning messages, etc., are built up in a 
local buffer which is emptied when full or when output is 
forced. If no listing device is specified, diagnostics only 
are output to TTY. 

PRLINE is called to print a source record with its ordinal 
line number, with indentation according to the current block 
level. Whilst PRLINE performs no output if no listing is 
specified, FOLINE in this case outputs the source line to 
TTY in an identical manner to PRLINE. 

PRTC is the basic listing output routine. This inserts one 
character into the listing buffer, and empties this when 
full. A number of utility routines for printing strings, or 
repeated characters, also make use of PRTC. 

8.3 Object program output 
8.3.1 Object program format 

The object code produced by the PL/MYCRO compiler is allocated 
to one of four address spaces (code segments) - program code, 
literals, data or memory (PSEG/LSEG/DSEG/MSEG). For relocatable 
code output, it would be the task of a linker program to map the 
relative address of each code record within each segment into an 
absolute address. 

The current version of the compiler produces only non-relocatable 
code, so that the information needed to perform this mapping must 
be specified to the compiler in the form of input parameters (see 
section 9 for details). Thus the user must specify where his 
program is to reside in memory at compilation time, rather than 
postponing this decision until load (execution) time. These
parameters then specify the absolute addresses where the
various code segments are to begin. They are actually the
initial values of the table LOCTAB, which contains the 
location counters for the different code segments. These 
location counters are therefore absolute addresses, and not 
relative to each segment start. For simplification, program 
code, literals and read-only data (as specified in DATA 
declarations) are all placed into the same segment, which 
would normally be allocated to a ROM area. Data and memory 
segments (representing the variable pool and the MEMORY 
vector) must be allocated to RAM areas. 


8.3.2 Literal handling 

Literals and read-only data are placed in the program code
segment, preceded by a jump around the data byte sequence.

In some cases, the destination address of this jump can be 
computed before the literal has been scanned, but in many 
cases it cannot, so that a dummy jump address must be 
output, and subsequently fixed up when the literal end is 
encountered. 

The following routines are used: 

LITERAL:  outputs a literal to the program code segment,
          preceded by a jump around the literal values - the
          scope of the jump can be pre-computed. Calls MOVNUM
          to output numeric literals and MOVSTR to output
          string literals.

PUTLST:   scans and outputs to program code segment a
          <constant list>, either from a DATA declaration
          or a literal of the form .(<constant list>). As
          the list length is unknown on entry, a dummy
          jump around is issued, which is fixed up on
          encountering the list terminator. The list is
          scanned and the values output by calls to MOVNUM
          or MOVSTR as appropriate.

MOVNUM:   places a single or double byte value in the program
          code segment (via EMIT).

MOVSTR:   places a string constant into the program code
          segment (via EMIT).


8.3.3 Output of object code - EMIT 

The present compiler produces only non-relocatable object code
in standard Intel format. With minor modifications, relocatable
code can be output in an extended Intel format. Both the 
standard and extended object code formats are described in 
Appendix A. 

The routine EMIT builds up type 0 object code records
containing the following information:

length - number of bytes in the data field of this record 

address - absolute memory address from where data is to 
be stored 

type - 0 

data - object code bytes; each 8 bits of code is output 
as two ASCII graphic characters. 

Input to EMIT is a single byte of object code, and an indication 
of which code segment this byte is to be placed into. EMIT builds 
up a local buffer, using a series of local status variables: 

XNUMB - data byte count 

XADRS - next free buffer location 

XSEG - current code segment 

XNEXT - expected location within current segment. 
XNEXT serves as a local location counter within the segment
XSEG (although it is actually an absolute address, not an
offset). XNEXT is incremented each time a byte is placed
into the current code segment. An attempt to output a code
byte to another segment (≠XSEG), or at an unexpected address
within the current segment (≠XNEXT), will force the object
code record to be output to the object code file (by a call
on the system routine OUTSEG). A new record is then begun
with the desired XSEG, and the initial value of XNEXT is
taken as the current location counter of XSEG. Records are
of course also output when the code buffer nears overflow.

This technique allows the problem of fixing up object code
addresses to be passed over to the loader program. It is
sufficient to break off a code record and begin a new one
containing the fixup information as data, and the address where
this is to be placed as the header address field. Note, however,
that this assumes a loader which processes these object code
records in a single forward scan (such as the Intel or MYCRON
loaders).
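
The record-breaking behaviour of EMIT, and the way a fixup becomes just another record, can be illustrated with the following sketch. It is written in Python and is not the compiler's code. The record layout and checksum follow the standard Intel hexadecimal format; the class Emitter, its attribute names, the explicit address argument and the 16-byte record limit are simplifying assumptions for this example.

```python
def hex_record(address, data, rectype=0):
    """Format one Intel hex record: ':' length address type data checksum."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, rectype]) + bytes(data)
    checksum = (-sum(body)) & 0xFF          # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


class Emitter:
    MAXLEN = 16                             # assumed maximum data bytes per record

    def __init__(self):
        self.records = []                   # finished records (stands in for OUTSEG)
        self.buf = bytearray()              # data bytes of the record being built
        self.xseg = None                    # current code segment
        self.start = 0                      # load address of the buffered record
        self.xnext = 0                      # expected address of the next byte

    def emit(self, seg, address, byte):
        """Place one byte; break the record on segment or address discontinuity."""
        if self.buf and (seg != self.xseg or address != self.xnext
                         or len(self.buf) >= self.MAXLEN):
            self.flush()
        if not self.buf:
            self.xseg, self.start = seg, address
        self.buf.append(byte & 0xFF)
        self.xnext = address + 1

    def flush(self):
        if self.buf:
            self.records.append(hex_record(self.start, self.buf))
            self.buf = bytearray()


if __name__ == "__main__":
    e = Emitter()
    for offset, b in enumerate([0xC3, 0x00, 0x00, 0x76]):   # JMP 0 (dummy), HLT
        e.emit("PSEG", 0x0100 + offset, b)
    e.emit("PSEG", 0x0101, 0x03)            # fixup: low byte of the jump address
    e.emit("PSEG", 0x0102, 0x01)            # fixup: high byte of the jump address
    e.flush()
    print("\n".join(e.records))
```

Because the fixup record follows the record containing the dummy jump, a loader that processes records in one forward scan simply overwrites the dummy address.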


8.3.4 Utility routines 

A number of utility routines exist for the output of object
code bytes. They are used extensively throughout the compiler:

PUTI:     outputs a single byte of object code to the program
          segment (PSEG) at the next instruction address, which
          is then incremented. Calls the coroutine DISASM (see
          below) to disassemble the instruction code byte and
          build up a line of assembler listing, if this option
          has been specified.

DISASM:   in principle a coroutine. The initial entry is with
          an instruction byte. The instruction mnemonic is
          output to the listing file, and the instruction byte
          is decoded as a 1-byte, 2-byte or 3-byte instruction.
          For a 1-byte instruction, the next call on DISASM
          will repeat this cycle. For 2-byte instructions, the
          next call on DISASM will interpret the input byte as
          an immediate operand value and list this. A subsequent
          call will repeat the cycle. For a 3-byte instruction,
          the next two calls on DISASM will interpret the input
          values as low and high address bytes respectively,
          and list them as such. A subsequent call will repeat
          the cycle.

PUTADR:   causes a two-byte address to be output starting at
          the current location in the specified segment.

FIXADR:   causes the current program code address to be output
          starting at the specified location in the program
          segment. (This implies terminating the current object
          code record.)
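
The coroutine behaviour of DISASM can be imitated with a Python generator, as in the sketch below. It is an illustration only and differs from the routine described: it lists the opcode in hexadecimal rather than as a mnemonic, and it delivers the whole listing line once the last byte of the instruction has been received. The instruction lengths are those of the documented 8080 opcodes; undocumented opcodes are treated as 1-byte instructions. The names instr_length and disasm are assumptions.

```python
def instr_length(op):
    """Length in bytes of a documented 8080 instruction with opcode op."""
    if op in (0x01, 0x11, 0x21, 0x31,       # LXI B/D/H/SP
              0x22, 0x2A, 0x32, 0x3A,       # SHLD, LHLD, STA, LDA
              0xC3, 0xCD):                  # JMP, CALL
        return 3
    if op & 0xC7 in (0xC2, 0xC4):           # conditional jumps and calls
        return 3
    if op & 0xC7 in (0x06, 0xC6):           # MVI r and immediate arithmetic/logic
        return 2
    if op in (0xD3, 0xDB):                  # OUT, IN
        return 2
    return 1


def disasm():
    """Generator used as a coroutine: send one byte per call, receive a
    listing line when an instruction is complete, otherwise None."""
    line = None
    while True:
        op = yield line                     # initial entry: instruction byte
        length = instr_length(op)
        if length == 1:
            line = f"{op:02X}"
        elif length == 2:
            operand = yield None            # next call: immediate operand
            line = f"{op:02X} imm={operand:02X}"
        else:
            low = yield None                # next two calls: low, then high
            high = yield None               # address byte
            line = f"{op:02X} addr={high:02X}{low:02X}"


if __name__ == "__main__":
    d = disasm()
    next(d)                                 # start the coroutine
    for byte in (0xC3, 0x03, 0x01, 0x3E, 0x2A, 0x76):
        result = d.send(byte)
        if result is not None:
            print(result)
```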
Appendix A: Proposed relocatable object code format

The basic relocatable object format is an extension of the 
load format defined for Intel 8080. 

The general format is: 


Byte no:    0     1 2       3 .. 6     7 8      9 .. N        N+1 N+2

            :     length    address    record   data          checksum
                                       type     (2 * length)

N = 9 + (2*length - 1)

length is the number of bytes in the data area.

As each byte is represented by two ASCII characters,
this means that the data area in the record is 2*length
characters long. (Normally, the maximum length used is 16.)

address is the absolute (or relative) address where the first 
byte of the data information should be stored. In some 
records of the extended format (02, 06) it gives the 
ordinal number of the first symbol (or location counter 
value for type 06) in the data field. 

record type specifies the type of record. The load format 
for Intel 8080 only uses records of type 00. The new 
record types introduced by the extended format are 
detailed below. 

data contains either the data to be loaded at the specified 
absolute or relative address (types 00, 01), or the 
symbols to be defined as external or entry (02,03), 
or relocation information (04 or 05) or maximum location 
counter values (06). 
checksum is the checksum for the record.

The relocatable object format defines several other record 
types, and the meaning of the data field is interpreted 
according to the record type. 

Presently, the following record types are defined: 

00 - absolute (load) record or EOF record. 

01 - relocatable code record 

02 - external reference record 

03 - entry point definition record 

04 - address relocation record 

05 - byte relocation record 

06 - location counter length record 

01 - relocatable code record 


The address part is interpreted as follows: 

character no 

3 location counter number 

4-6 relative address within location counter 

The data field is relocatable code to go into the specified 
address and subsequent addresses. 

02 - external reference record 


The address part defines the relocation number for the first 
of the external symbols mentioned in this record. The 
corresponding number is used in the relocation records (04) 
to indicate relocation by an external symbol. The lowest 
number is 10 hex. The data field gives 10 ASCII characters 
for each external reference (numerical representation of 
characters). 
03 - entry point definition record

Address part as type 01 giving the address of the entry point. 

The data field contains 10 ASCII characters (numerical 
representation of characters). 

04 - address relocation record 

The address part is ignored. 

The data field contains a number of 8 character entries:

    loccounter (2)   address (4)   relocby (2)

where 'loccounter' and 'address' determine the location to
be relocated, and 'relocby' indicates which value should be
added (location counter 1-F, or relocation by an external
symbol (external symbol number)).
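
How a linker might split the data field of a type 04 record into its entries is sketched below in Python. This is an illustration under stated assumptions, not part of the proposal: the function names are hypothetical, the fields are assumed to be hexadecimal digits, and the rule that values from 10 hex upwards denote external symbols is inferred from the description of the 02 record.

```python
def parse_reloc_entries(data_field):
    """Split a type 04 data field into (loccounter, address, relocby) tuples."""
    if len(data_field) % 8 != 0:
        raise ValueError("data field must consist of 8-character entries")
    entries = []
    for i in range(0, len(data_field), 8):
        entry = data_field[i:i + 8]
        loccounter = int(entry[0:2], 16)    # 2 characters
        address = int(entry[2:6], 16)       # 4 characters
        relocby = int(entry[6:8], 16)       # 2 characters
        entries.append((loccounter, address, relocby))
    return entries


def relocby_kind(relocby):
    """Classify the 'relocby' field (assumed ranges, see lead-in)."""
    if 0x01 <= relocby <= 0x0F:
        return "location counter"
    if relocby >= 0x10:
        return "external symbol"
    return "undefined"


if __name__ == "__main__":
    for loc, addr, by in parse_reloc_entries("0101230202004512"):
        print(loc, hex(addr), relocby_kind(by))
```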

05 - byte relocation record 

Similar to 04 record but only one byte is relocated. The lower 
order 8-bits of the address part of the corresponding entry 
point is added to the specified location. 

06 - location counter length record 


This record defines the length of the code under each location 
counter (four ASCII characters per counter) in the data area. 

The last character of the address field gives the first location 
counter number in the list. 




The typical sequence of a relocatable object file produced
by the cross assembler will be:

    02          records defining external names.

    01          records with intervening 04 and 05 records for
                relocation, 03 records for entry point
                definition and 00 records for absolute code.

    06          record(s)

    00          record (EOF)

This will be called a linkable object file. Such object
files may be used as input to the linker program to produce
a loadable object file.

A loadable object file may be relocatable or absolute.

An absolute loadable object file consists only of 00 records,
while a relocatable loadable object file consists of:

    00 and 01   records with intervening 04 and 05 records
                for relocation

    06          record

    00          record (EOF)

In a relocatable loadable file, only one location counter
is used.






